POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA
CONSTRUCTS THEREFOR



Cross-Reference to Related Applications

The present application claims priority to related U.S. patent application Serial
Nos. 60/102,748, filed 2 Oct 1998; 60/139,650, filed 17 June 1999; and 60/123,810, filed
11 Mar. 1999, each of which is incorporated herein by reference.

Field of the Invention

The present invention relates to polyketides and the polyketide synthase (PKS)
enzymes that produce them. The invention also relates generally to genes encoding PKS
enzymes and to recombinant host cells containing such genes and in which expression of
such genes leads to the production of polyketides. The present invention also relates to
compounds useful as medicaments having immunosuppressive and/or neurotrophic
activity. Thus, the invention relates to the fields of chemistry, molecular biology, and
agricultural, medical, and veterinary technology.



Background of the Invention 

Polyketides are a class of compounds synthesized from 2-carbon units through a
series of condensations and subsequent modifications. Polyketides occur in many types of
organisms, including fungi and mycelial bacteria, in particular, the actinomycetes.
Polyketides are biologically active molecules with a wide variety of structures, and the
class encompasses numerous compounds with diverse activities. Tetracycline,
erythromycin, epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin,
spinosyn, and tylosin are examples of polyketides. Given the difficulty in producing
polyketide compounds by traditional chemical methodology, and the typically low
production of polyketides in wild-type cells, there has been considerable interest in
finding improved or alternate means to produce polyketide compounds.

This interest has resulted in the cloning, analysis, and manipulation by
recombinant DNA technology of genes that encode PKS enzymes. The resulting
technology allows one to manipulate a known PKS gene cluster either to produce the
polyketide synthesized by that PKS at higher levels than occur in nature or in hosts that
otherwise do not produce the polyketide. The technology also allows one to produce
molecules that are structurally related to, but distinct from, the polyketides produced from
known PKS gene clusters. See, e.g., PCT publication Nos. WO 93/13663; 95/08548;
96/40968; 97/02358; 98/27203; and 98/49315; United States Patent Nos. 4,874,748;
5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146; 5,830,750; and 5,843,718; and
Fu et al., 1994, Biochemistry 33: 9321-9326; McDaniel et al., 1993, Science 262: 1546-
1550; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888, each of which is
incorporated herein by reference.

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which are
complexes of multiple large proteins, are similar to the synthases that catalyze
condensation of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the
biosynthesis of polyketides through repeated, decarboxylative Claisen condensations
between acylthioester building blocks. The building blocks used to form complex
polyketides are typically acylthioesters, such as acetyl, butyryl, propionyl, malonyl,
hydroxymalonyl, methylmalonyl, and ethylmalonyl CoA. Other building blocks include
amino acid-like acylthioesters. PKS enzymes that incorporate such building blocks
include an activity that functions as an amino acid ligase (an AMP ligase) or as a non-
ribosomal peptide synthetase (NRPS). Two major types of PKS enzymes are known;
these differ in their composition and in their mode of synthesis of the polyketide.
These two major types of PKS enzymes are commonly referred to as Type I or "modular"
and Type II or "iterative" PKS enzymes.

In the Type I or modular PKS enzyme group, a set of separate catalytic active
sites (each active site is termed a "domain", and a set thereof is termed a "module") exists
for each cycle of carbon chain elongation and modification in the polyketide synthesis
pathway. The typical modular PKS is composed of several large polypeptides, which can
be segregated from amino to carboxy termini into a loading module, multiple extender
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as 6-
deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six
extender modules, and TE of DEBS are present on three separate proteins (designated
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters
304: 205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by
reference.

Generally, the loading module is responsible for binding the first building block 
used to synthesize the polyketide and transferring it to the first extender module. The 
loading module of DEBS consists of an acyltransferase (AT) domain and an acyl carrier 
protein (ACP) domain. Another type of loading module utilizes an inactivated 
ketosynthase (KS) domain and AT and ACP domains. This inactivated KS is in some 
instances called KS^Q, where the superscript letter is the abbreviation for the amino acid,
glutamine, that is present instead of the active site cysteine required for ketosynthase
activity. In other PKS enzymes, including the FK-506 PKS, the loading module
incorporates an unusual starter unit and is composed of a CoA ligase-like activity domain.
In any event, the loading module recognizes a particular acyl-CoA (usually acetyl or 
propionyl but sometimes butyryl or other acyl-CoA) and transfers it as a thiol ester to the 
ACP of the loading module. 

The AT on each of the extender modules recognizes a particular extender-CoA
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and 2- 
hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester. 
Each extender module is responsible for accepting a compound from a prior module, 
binding a building block, attaching the building block to the compound from the prior 
module, optionally performing one or more additional functions, and transferring the 
resulting compound to the next module. 

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one, 
two, or three domains that modify the beta-carbon of the growing polyketide chain. A
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP 
domain. These three domains are sufficient to activate a 2-carbon extender unit and attach 
it to the growing polyketide molecule. The next extender module, in turn, is responsible 
for attaching the next building block and transferring the growing compound to the next 
extender module until synthesis is complete. 

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the 
loading module is transferred to form a thiol ester (trans-esterification) at the KS of the 
first extender module; at this stage, extender module one possesses an acyl-KS and a 
malonyl (or substituted malonyl) ACP. The acyl group derived from the loading module 
is then covalently attached to the alpha-carbon of the malonyl group to form a carbon- 
carbon bond, driven by concomitant decarboxylation, and generating a new acyl-ACP 
that has a backbone two carbons longer than the loading building block (elongation or 
extension). 

The polyketide chain, growing by two carbons at each extender module, is
sequentially passed as covalently bound thiol esters from extender module to extender 
module, in an assembly line-like process. The carbon chain produced by this process 
alone would possess a ketone at every other carbon atom, producing a polyketone, from 
which the name polyketide arises. Most commonly, however, additional enzymatic 
activities modify the beta keto group of each two carbon unit just after it has been added 
to the growing polyketide chain but before it is transferred to the next module. 

In addition to the minimal module containing the KS, AT, and ACP domains
necessary to form the carbon-carbon bond, and as noted above, other domains that
modify the beta-carbonyl moiety can be present. Thus, modules may contain a
ketoreductase (KR) domain that reduces the keto group to an alcohol. Modules may also
contain a KR domain plus a dehydratase (DH) domain that dehydrates the alcohol to a
double bond. Modules may also contain a KR domain, a DH domain, and an
enoylreductase (ER) domain that reduces the double bond to a saturated single
bond, leaving the beta-carbon as a methylene group. An extender module can also contain
other enzymatic activities, such as, for example, a methylase or dimethylase activity. 
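
The relationship between the reductive domains present in an extender module and the
functionality left at the beta-carbon can be illustrated with a short Python sketch. The
sketch is illustrative only and is not part of the invention; the names
beta_carbon_state and backbone_length, and the string labels used, are hypothetical and
were chosen for this example. It assumes that every listed domain is active, that only
the three combinations described above occur, and that each extender module adds two
carbons to the backbone. The three-carbon (propionyl) starter used for DEBS in the usage
lines is likewise an assumption of the example.

```python
from typing import Iterable

# Domains every extender module needs to form the carbon-carbon bond.
MINIMAL_DOMAINS = frozenset({"KS", "AT", "ACP"})
# Optional domains that process the beta-keto group.
REDUCTIVE_DOMAINS = frozenset({"KR", "DH", "ER"})

# Set of reductive domains present -> functionality left at the beta-carbon.
BETA_CARBON_OUTCOME = {
    frozenset(): "ketone",
    frozenset({"KR"}): "alcohol",
    frozenset({"KR", "DH"}): "double bond",
    frozenset({"KR", "DH", "ER"}): "methylene",
}


def beta_carbon_state(domains: Iterable[str]) -> str:
    """Return the beta-carbon functionality produced by one extender module.

    Assumes every listed domain is catalytically active (hypothetical helper).
    """
    present = frozenset(domains)
    if not MINIMAL_DOMAINS <= present:
        raise ValueError("an extender module needs KS, AT, and ACP domains")
    reductive = present & REDUCTIVE_DOMAINS
    if reductive not in BETA_CARBON_OUTCOME:
        raise ValueError(f"combination not covered by this sketch: {sorted(reductive)}")
    return BETA_CARBON_OUTCOME[reductive]


def backbone_length(starter_carbons: int, extender_modules: int) -> int:
    """Backbone carbons after all extender modules, at two carbons per module."""
    return starter_carbons + 2 * extender_modules


# A minimal module (KS, AT, ACP only) leaves the ketone untouched.
print(beta_carbon_state(["KS", "AT", "ACP"]))
# A module that also carries a KR domain leaves an alcohol.
print(beta_carbon_state(["KS", "AT", "KR", "ACP"]))
# Assumed three-carbon starter plus six extender modules, as in DEBS.
print(backbone_length(3, 6))
```

A module in which a reductive domain is present but inactive would have to be entered
without that domain for the sketch to give the expected result.
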

After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide.
For example, final synthesis of 6-deoxyerythronolide B (6-dEB) is regulated by a TE
domain located at the end of extender module six. In the synthesis of 6-dEB, the TE
domain catalyzes cyclization of the macrolide ring by formation of an ester linkage. In
FK-506, FK-520, rapamycin, and similar polyketides, the TE activity is replaced by a
RapP (for rapamycin) or RapP-like activity that makes a linkage incorporating a
pipecolic acid residue. The enzymatic activity that catalyzes this incorporation for the
rapamycin enzyme is known as RapP, encoded by the rapP gene. The polyketide can be
modified further by tailoring enzymes; these enzymes add carbohydrate groups or methyl
groups, or make other modifications, e.g., oxidation or reduction, on the polyketide core
molecule. For example, 6-dEB is hydroxylated at C-6 and C-12 and glycosylated at C-3
and C-5 in the synthesis of erythromycin A.

In Type I PKS polypeptides, the order of catalytic domains is conserved. When all 
beta-keto processing domains are present in a module, the order of domains in that 
module from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of 
the beta-keto processing domains may be missing in particular modules, but the order of 
the domains present in a module remains the same. The order of domains within modules 
is believed to be important for proper folding of the PKS polypeptides into an active
complex. Importantly, there is considerable flexibility in PKS enzymes, which allows for
the genetic engineering of novel catalytic complexes. The engineering of these enzymes
is achieved by modifying, adding, or deleting domains, or replacing them with those
taken from other Type I PKS enzymes. It is also achieved by deleting or adding entire
modules, or replacing them with modules taken from other sources. A genetically engineered
PKS complex should of course have the ability to catalyze the synthesis of the product 
predicted from the genetic alterations made. 
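
The conserved order of domains stated above lends itself to a simple check on a
proposed module design. The following Python sketch is illustrative only; the function
name follows_canonical_order and the use of plain domain abbreviations as strings are
assumptions of this example. It tests only that the domains of a single module appear in
the N-to-C-terminal order KS, AT, DH, ER, KR, ACP, with any of them permitted to be
absent; it does not check that a module contains the KS, AT, and ACP domains, and it says
nothing about whether an engineered module would fold or be active.

```python
from typing import Sequence

# N-to-C-terminal order of domains in a Type I PKS module, as stated in the text.
CANONICAL_ORDER = ("KS", "AT", "DH", "ER", "KR", "ACP")


def follows_canonical_order(domains: Sequence[str]) -> bool:
    """Return True if the listed domains respect the conserved order.

    Domains may be missing, but unknown or repeated domains are rejected.
    """
    try:
        positions = [CANONICAL_ORDER.index(d) for d in domains]
    except ValueError:
        # A domain abbreviation outside the canonical set was supplied.
        return False
    # Strictly increasing positions means correct order and no repeats.
    return all(a < b for a, b in zip(positions, positions[1:]))


# A module with only a KR processing domain, listed in the conserved order.
print(follows_canonical_order(["KS", "AT", "KR", "ACP"]))
# The same domains with KR placed ahead of DH violate the conserved order.
print(follows_canonical_order(["KS", "AT", "KR", "DH", "ACP"]))
```
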

Alignments of the many available amino acid sequences for Type I PKS enzymes
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N-
and C-termini of individual polypeptides. The sequences of these linker regions are less
well conserved than are those for the catalytic domains, which is in part how linker
regions are identified. Linker regions can be important for proper association between 
domains and between the individual polypeptides that comprise the PKS complex. One 
can thus view the linkers and domains together as creating a scaffold on which the 
domains and modules are positioned in the correct orientation to be active. This 
organization and positioning, if retained, permits PKS domains of different or identical 
substrate specificities to be substituted (usually at the DNA level) between PKS enzymes 
by various available methodologies. In selecting the boundaries of, for example, an AT 
replacement, one can thus make the replacement so as to retain the linkers of the recipient 
PKS or to replace them with the linkers of the donor PKS AT domain, or, preferably, 
make both constructs to ensure that the correct linker regions between the KS and AT 
domains have been included in at least one of the engineered enzymes. Thus, there is 
considerable flexibility in the design of new PKS enzymes with the result that known 
polyketides can be produced more effectively, and novel polyketides useful as 
pharmaceuticals or for other purposes can be made. 

By appropriate application of recombinant DNA technology, a wide variety of 
polyketides can be prepared in a variety of different host cells provided one has access to 
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes. 
The present invention helps meet the need for such nucleic acid compounds by providing 
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520 
modification enzymes. Moreover, while the FK-506 and FK-520 polyketides have many 
useful activities, there remains a need for compounds with similar useful activities but 
with a better pharmacokinetic profile and metabolism and fewer side effects. The present
invention helps meet the need for such compounds as well. 

Summary of the Invention 
In one embodiment, the present invention provides recombinant DNA vectors that 
encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention
include cosmids pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C35, pKOS065-M27,
and pKOS065-M21. The invention also provides nucleic acid compounds that
encode the various domains of the FK-520 PKS, i.e., the KS, AT, ACP, KR, DH, and ER
domains. These compounds can be readily used, alone or in combination with nucleic 
acids encoding other FK-520 or non-FK-520 PKS domains, as intermediates in the 
construction of recombinant vectors that encode all or part of PKS enzymes that make 
novel polyketides. 

The invention also provides isolated nucleic acids that encode all or part of one or 
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an
acyl transferase activity, and an acyl carrier protein activity. The invention provides an 
isolated nucleic acid that encodes one or more open reading frames of FK-520 PKS 
genes, said open reading frames comprising coding sequences for a CoA ligase activity, 
an NRPS activity, or two or more extender modules. The invention also provides 
recombinant expression vectors containing these nucleic acids. 

In another embodiment, the invention provides isolated nucleic acids that encode 
all or a part of a PKS that contains at least one module in which at least one of the 
domains in the module is a domain from a non-FK-520 PKS and at least one domain is 
from the FK-520 PKS. The non-FK-520 PKS domain or module originates from the 
rapamycin PKS, the FK-506 PKS, DEBS, or another PKS. The invention also provides 
recombinant expression vectors containing these nucleic acids. 

In another embodiment, the invention provides a method of preparing a 
polyketide, said method comprising transforming a host cell with a recombinant DNA 
vector that encodes at least one module of a PKS, said module comprising at least one 
FK-520 PKS domain, and culturing said host cell under conditions such that said PKS is 
produced and catalyzes synthesis of said polyketide. In one aspect, the method is 
practiced with a Streptomyces host cell. In another aspect, the polyketide produced is FK- 
520. In another aspect, the polyketide produced is a polyketide related in structure to FK- 
520. In another aspect, the polyketide produced is a polyketide related in structure to FK- 
506 or rapamycin. 

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These genes 
and the methods of the invention enable one to create recombinant host cells with the
ability to produce polyketides or other compounds that require ethylmalonyl CoA for
biosynthesis. The invention also provides recombinant nucleic acids that encode AT 
domains specific for ethylmalonyl CoA. Thus, the compounds of the invention can be 
used to produce polyketides requiring ethylmalonyl CoA in host cells that otherwise are 
unable to produce such polyketides. 

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA in a 
heterologous host cell. These genes and the methods of the invention enable one to create 
recombinant host cells with the ability to produce polyketides or other compounds that 
require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides 
recombinant nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA 
and 2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to 
produce polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in host 
cells that are otherwise unable to produce such polyketides. 

In another embodiment, the invention provides a compound related in structure to 
FK-520 or FK-506 that is useful in the treatment of a medical condition. These 
compounds include compounds in which the C-13 methoxy group is replaced by a moiety 
selected from the group consisting of hydrogen, methyl, and ethyl moieties. Such 
compounds are less susceptible to the main in vivo pathway of degradation for FK-520 
and FK-506 and related compounds and thus exhibit an improved pharmacokinetic 
profile. The compounds of the invention also include compounds in which the C-15 
methoxy group is replaced by a moiety selected from the group consisting of hydrogen, 
methyl, and ethyl moieties. The compounds of the invention also include the above 
compounds further modified by chemical methodology to produce derivatives such as, 
but not limited to, the C-18 hydroxyl derivatives, which have potent neurotrophic but not
immunosuppressive activity.

wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and 18-
hydroxy-FK-506. The invention provides these compounds in purified form and in
pharmaceutical compositions.

In another embodiment, the invention provides a method for treating a medical 
condition by administering a pharmaceutically efficacious dose of a compound of the 
invention. The compounds of the invention may be administered to achieve 
immunosuppression or to stimulate nerve growth and regeneration. 

These and other embodiments and aspects of the invention will be more fully 
understood after consideration of the attached Drawings and their brief description below, 
together with the detailed description, examples, and claims that follow. 

Brief Description of the Drawings 
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line
provides a scale in kilobase pairs (kb). The second line shows a restriction map with
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is
SacI; P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and
related genes. Genes are abbreviated with a one letter designation, e.g., C is fkbC.
Immediately under the third line are numbered segments showing where the loading
module (L) and ten different extender modules (numbered 1 - 10) are encoded on the
various genes shown. At the bottom of the Figure, the DNA inserts of various cosmids of
the invention (e.g., 34-124 is cosmid pKOS034-124) are shown in alignment with the FK-
520 biosynthetic gene cluster.

Figure 2 shows the loading module (load), the ten extender modules, and the
peptide synthetase domain of the FK-520 PKS, together with, on the top line, the genes
that encode the various domains and modules. Also shown are the various intermediates
in FK-520 biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and
31 numbered. The various domains of each module and subdomains of the loading
module are also shown. The darkened circles showing the DH domains in modules 2, 3,
and 4 indicate that the dehydratase domain is not functional as a dehydratase; this domain
may affect the stereochemistry at the corresponding position in the polyketide. The
substituents on the FK-520 structure that result from the action of non-PKS enzymes are
also indicated by arrows, together with the types of enzymes or the genes that code for
the enzymes that mediate the action. Although the methyltransferase is shown acting at
the C-13 and C-15 hydroxyl groups after release of the polyketide from the PKS, the
methyltransferase may act on the 2-hydroxymalonyl substrate prior to or
contemporaneously with its incorporation during polyketide synthesis.

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520
(Figure 2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an
ethylmalonyl specific AT domain in extender module 4 of the PKS. At least four of the
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA
pools during FK-520 production. Polyhydroxybutyrate accumulates during vegetative
growth and disappears during stationary phase in other Streptomyces (Ranade and
Vining, 1993, Can. J. Microbiol. 39:377). Open reading frames with unknown function
are indicated with a question mark.

Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA
from acetoacetyl CoA consistent with the function assigned to four of the genes in the
FK-520 gene cluster shown in Figure 3.

Figure 5 shows a close-up view of the right-end of the FK-520 PKS gene cluster 
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM
(a methyl transferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to
be a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of
ethylmalonyl CoA).

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506) 
metabolism. 

Figure 7 shows a schematic process for the construction of recombinant PKS 
genes of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506 
and FK-520 polyketides of the invention, as described in Example 4, below. 

Figure 8, in Parts A and B, shows certain compounds of the invention preferred 
for dermal application in Part A and a synthetic route for making those compounds in 
Part B.

Detailed Description of the Invention 
Given the valuable pharmaceutical properties of polyketides, there is a need for 
methods and reagents for producing large quantities of polyketides, as well as for 
producing related compounds not found in nature. The present invention provides such 
methods and reagents, with particular application to methods and reagents for producing 
the polyketides known as FK-520, also known as ascomycin or L-683,590 (see Holt et
al., 1993, JACS 115:9925), and FK-506, also known as tacrolimus. Tacrolimus is a
macrolide immunosuppressant used to prevent or treat rejection of transplanted heart,
kidney, liver, lung, pancreas, and small bowel allografts. The drug is also useful for the
prevention and treatment of graft-versus-host disease in patients receiving bone marrow
transplants, and for the treatment of severe, refractory uveitis. There have been additional
reports of the unapproved use of tacrolimus for other conditions, including alopecia
universalis, autoimmune chronic active hepatitis, inflammatory bowel disease, multiple
sclerosis, primary biliary cirrhosis, and scleroderma. The invention provides methods and
reagents for making novel polyketides related in structure to FK-520 and FK-506, and
structurally related polyketides such as rapamycin.

The FK-506 and rapamycin polyketides are potent immunosuppressants, with
chemical structures shown below.

[Structures: FK-506; Rapamycin]

FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having
instead an ethyl group at that position, and has activity similar to that of FK-506, albeit
with reduced immunosuppressive activity.

These compounds act through initial formation of an intermediate complex with
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including FKBP-
12. Immunophilins are a class of cytosolic proteins that form complexes with molecules
such as FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular
targets involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to
FKBP occurs through the structurally similar segments of the polyketide molecules,
known as the "FKBP-binding domain" (as generally but not precisely indicated by the
stippled regions in the structures above). The FK-506-FKBP complex then binds
calcineurin, while the rapamycin-FKBP complex binds to a protein known as RAFT-1.
Binding of the FKBP-polyketide complex to these second proteins occurs through the
dissimilar regions of the drugs known as the "effector" domains.



The three component FKBP-polyketide-effector complex is required for signal
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin
that destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of
immunosuppressive activity, even though FKBP binding is unaffected. Further, such
analogs antagonize the immunosuppressive effects of the parent polyketides, because
they compete for FKBP. Such non-immunosuppressive analogs also show reduced
toxicity (see Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760),
indicating that much of the toxicity of these drugs is not linked to FKBP binding.

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have
neurotrophic activity. In the central nervous system and in peripheral nerves,
immunophilins are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP
is markedly enriched in the central nervous system and in peripheral nerves. Molecules
that bind to the neuroimmunophilin FKBP, such as FK-506 and FK-520, have the
remarkable effect of stimulating nerve growth. In vitro, they act as neurotrophins, i.e.,
they promote neurite outgrowth in NGF-treated PC12 cells and in sensory neuronal
cultures, and in intact animals, they promote regrowth of damaged facial and sciatic
nerves, and repair lesioned serotonin and dopamine neurons in the brain. See Gold et al.,
Jun. 1999, J. Pharm. Exp. Ther. 289(3): 1202-1210; Lyons et al., 1994, Proc. National
Academy of Science 91: 3191-3195; Gold et al., 1995, Journal of Neuroscience 15: 7509-
7516; and Steiner et al., 1997, Proc. National Academy of Science 94: 2019-2024.
Further, the restored central and peripheral neurons appear to be functional.

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the small-
molecule neurotrophins such as FK-506, FK-520, and rapamycin have different, and
often advantageous, properties. First, whereas protein neurotrophins are difficult to
deliver to their intended site of action and may require intra-cranial injection, the small-
molecule neurotrophins display excellent bioavailability; they are active when
administered subcutaneously and orally. Second, whereas protein neurotrophins show
quite specific effects, the small-molecule neurotrophins show rather broad effects.
Finally, whereas protein neurotrophins often show effects on normal sensory nerves, the
small-molecule neurotrophins do not induce aberrant sprouting of normal neuronal
processes and seem to affect damaged nerves specifically. Neuroimmunophilin ligands
have potential therapeutic utility in a variety of disorders involving nerve degeneration
(e.g., multiple sclerosis, Parkinson's disease, Alzheimer's disease, stroke, traumatic spinal
cord and brain injury, peripheral neuropathies).

Recent studies have shown that the immunosuppressive and neurite outgrowth
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative
activity in the absence of immunosuppressive activity is retained by agents which bind to
FKBP but not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997,
Nature Medicine 3: 421-428.




Available structure-activity data show that the important features for neurotrophic
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments
of the macrolide ring that bind to FKBP. This portion of the molecule is termed the
"FKBP binding domain" (see VanDuyne et al., 1993, Journal of Molecular Biology 229:
105-124). Nevertheless, the effector domains of the parent macrolides contribute to
conformational rigidity of the binding domain and thus indirectly contribute to FKBP
binding.



There are a number of other reported analogs of FK-506, FK-520, and rapamycin that
bind to FKBP but not the effector protein calcineurin or RAFT. These analogs show
effects on nerve regeneration without immunosuppressive effects.

Naturally occurring FK-520 and FK-506 analogs include the antascomycins,
which are FK-506-like macrolides that lack the functional groups of FK-506 that bind to
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These
molecules bind FKBP as effectively as does FK-506; they antagonize the effects of both
FK-506 and rapamycin, yet lack immunosuppressive activity.

[Structure: Antascomycin A]

Other analogs can be produced by chemically modifying FK-506, FK-520, or
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the
effector binding region of FK-506, FK-520, or rapamycin by chemical modification.
While the chemical modifications permitted on the parent compounds are quite limited,
some useful chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 =
0.7 nM for FKBP binding; see Dumont et al., 1992), and the rapamycin analog
WAY-124,466 (IC50 = 12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical
Research Communications 192: 1340-1346) are about as effective as FK-506, FK-520, and
rapamycin at promoting neurite outgrowth in sensory neurons (see Steiner et al., 1997).




L-685,818    WAY-124,466

One of the few positions of rapamycin that is readily amenable to chemical
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of
rapamycin with a variety of bulky groups has produced analogs showing selective loss of
immunosuppressive activity while retaining FKBP-binding (see Luengo et al., 1995,
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in
binding to FKBP.




1 



There are also synthetic analogs of FKBP binding domains. These compounds
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally
designed" molecules that retain the FKBP-binding region in an appropriate conformation
for binding to FKBP, but do not possess the effector binding regions. In one example, the
ends of the FKBP binding domain were tethered by hydrocarbon chains (see Holt et al.,
1993, Journal of the American Chemical Society 115: 9925-9938); the best analog, 2,
below, binds to FKBP about as well as FK-506. In a similar approach, the ends of the
FKBP binding domain were tethered by a tripeptide to give analog 3, below, which binds
to FKBP about 20-fold more weakly than FK-506. These compounds are anticipated to have
neuroimmunophilin binding activity.

2    3



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is
a neurotoxin, which, when administered to animals, selectively damages nigral-striatal
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease.
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand
restored the ability of animals to feed themselves and gave improvements in measures of
locomotor activity, neurological outcome, and fine motor control. There were also
corresponding increases in regrowth of damaged nerve terminals. These results
demonstrate the utility of FKBP ligands for treatment of diseases of the CNS.

From the above description, two general approaches towards the design of
non-immunosuppressant, neuroimmunophilin ligands can be seen. The first involves the
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain
is fixed in a conformation optimal for binding to FKBP. The advantages of this approach
are that the conformation of the analogs can be accurately modeled and predicted by
computational methods, and the analogs closely resemble parent molecules that have
proven pharmacological properties. A disadvantage is that the difficult chemistry limits
the numbers and types of compounds that can be prepared. The second approach involves
the trial and error construction of acyclic analogs of the FKBP binding domain by
conventional medicinal chemistry. The advantage of this approach is that the chemistry
is suitable for production of the numerous compounds needed for such interactive
chemistry-bioassay approaches. The disadvantages are that the molecular types of
compounds that have emerged have no known history of appropriate pharmacological
properties, have rather labile ester functional groups, and are too conformationally mobile
to allow accurate prediction of conformational properties.

The present invention provides useful methods and reagents related to the first
approach, but with significant advantages. The invention provides recombinant PKS
genes that produce a wide variety of polyketides that cannot otherwise be readily
synthesized by chemical methodology alone. Moreover, the present invention provides
polyketides that have either or both of the desired immunosuppressive and neurotrophic
activities, some of which are produced only by fermentation and others of which are
produced by fermentation and chemical modification. Thus, in one aspect, the invention
provides compounds that optimally bind to FKBP but do not bind to the effector proteins.
The methods and reagents of the invention can be used to prepare numerous constrained
cyclic analogs of FK-520 in which the FKBP binding domain is fixed in a conformation
optimal for binding to FKBP. Such compounds will show neuroimmunophilin binding
(neurotrophic) but not immunosuppressive effects. The invention also allows direct
manipulation of FK-520 and related chemical structures via genetic engineering of the
enzymes involved in the biosynthesis of FK-520 (as well as related compounds, such as
FK-506 and rapamycin); similar chemical modifications are simply not possible because
of the complexity of the structures. The invention can also be used to introduce "chemical
handles" into normally inert positions that permit subsequent chemical modifications.

Several general approaches to achieve the development of novel
neuroimmunophilin ligands are facilitated by the methods and reagents of the present
invention. One approach is to make "point mutations" of the functional groups of the
parent FK-520 structure that bind to the effector molecules to eliminate their binding
potential. These types of structural modifications are difficult to perform by chemical
modification, but can be readily accomplished with the methods and reagents of the
invention.

A second, more extensive approach facilitated by the present invention is to
utilize molecular modeling to predict optimal structures ab initio that bind to FKBP but
not effector molecules. Using the available X-ray crystal structure of FK-520 (or FK-506)
bound to FKBP, molecular modeling can be used to predict polyketides that should
optimally bind to FKBP but not calcineurin. Various macrolide structures can be
generated by linking the ends of the FKBP-binding domain with "all possible" polyketide
chains of variable length and substitution patterns that can be prepared by genetic
manipulation of the FK-520 or FK-506 PKS gene cluster in accordance with the methods
of the invention. The ground state conformations of the virtual library can be determined,
and compounds that possess binding domains most likely to bind well to FKBP can be
prepared and tested.
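
The scale of such a virtual library can be appreciated with a small enumeration. The following Python sketch is illustrative only and rests on assumptions not stated in the text: the three extender units, the four reduction states, and the range of two to four linker modules are hypothetical choices made for the example, and the function names are likewise invented here. It shows only the combinatorial bookkeeping, not any conformational modeling.

```python
from itertools import product

# Hypothetical building blocks for each linker module (assumptions for illustration).
EXTENDER_UNITS = ("malonyl", "methylmalonyl", "ethylmalonyl")
REDUCTION_STATES = ("keto", "hydroxyl", "alkene", "methylene")


def enumerate_linkers(min_modules, max_modules):
    """Yield every linker chain as a tuple of (extender unit, reduction state) pairs."""
    module_choices = list(product(EXTENDER_UNITS, REDUCTION_STATES))
    for n_modules in range(min_modules, max_modules + 1):
        for chain in product(module_choices, repeat=n_modules):
            yield chain


def library_size(min_modules, max_modules):
    """Count the chains without materializing them."""
    per_module = len(EXTENDER_UNITS) * len(REDUCTION_STATES)
    return sum(per_module ** n for n in range(min_modules, max_modules + 1))


if __name__ == "__main__":
    print(library_size(2, 4), "candidate linkers for chains of 2 to 4 modules")
    first = next(enumerate_linkers(2, 4))
    print("first candidate:", first)
```

Even with these few assumed options per module, the candidate count grows geometrically with chain length, which is why a computational filter on predicted conformation would precede any preparation and testing.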

Once a compound is identified in accordance with the above approaches, the
invention can be used to generate a focused library of analogs around the lead candidate,
to "fine tune" the compound for optimal properties. Finally, the genetic engineering
methods of the invention can be directed towards producing "chemical handles" that
enable medicinal chemists to modify positions of the molecule previously inert to
chemical modification. This opens the path to previously prohibited chemical
optimization of lead compounds by time-proven approaches.

Moreover, the present invention provides polyketide compounds, and the
recombinant genes for the PKS enzymes that produce the compounds, that have
significant advantages over FK-506 and FK-520 and their analogs. The metabolism and
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to
be similar in these respects. Absorption of tacrolimus is rapid, variable, and incomplete
from the gastrointestinal tract (Harrison's Principles of Internal Medicine, 14th edition,
1998, McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form
is 27% (range, 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65
L per kg of body weight (L/kg), and is much higher than the VolD based on whole blood
concentrations, the difference reflecting the binding of tacrolimus to red blood cells.
Whole blood concentrations may be 12 to 67 times the plasma concentrations. Protein
binding is high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The
half-life for distribution is 0.9 hour; elimination is biphasic and variable, with a
terminal half-life of 11.3 hours (range, 3.5 to 40.5 hours). The time to peak
concentration is 0.5 to 4 hours after oral administration.
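
As a rough illustration of what the terminal half-life values quoted above imply, the following Python sketch computes the fraction of drug remaining under simple first-order elimination. It is a simplification under stated assumptions: it ignores the distribution phase (the text describes elimination as biphasic), and the function name and the 24-hour time point are chosen here for illustration only.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of drug remaining after `hours`, assuming single-phase
    first-order elimination (an assumption; the text describes biphasic kinetics)."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


if __name__ == "__main__":
    # Terminal half-life values quoted in the text: mean 11.3 h, range 3.5 to 40.5 h.
    for half_life in (3.5, 11.3, 40.5):
        remaining = fraction_remaining(24, half_life)
        print(f"t1/2 = {half_life:>4} h: {remaining:.1%} remains after 24 h")
```

The spread across the reported range is one way to see why dosing of the parent compound is difficult to predict from patient to patient.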
Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the liver
and small intestine. The drug is extensively metabolized with less than 1% excreted
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus, doses
have to be reduced substantially in primary graft non-function, especially in children. In
addition, drugs that induce the cytochrome P450 3A enzymes reduce tacrolimus levels,
while drugs that inhibit these P450s increase tacrolimus levels. Tacrolimus bioavailability
doubles with co-administration of ketoconazole, a drug that inhibits P450 3A. See
Vincent et al., 1992, In vitro metabolism of FK-506 in rat, rabbit, and human liver
microsomes: Identification of a major metabolite and of cytochrome P450 3A as the
major enzymes responsible for its metabolism, Arch. Biochem. Biophys. 294: 454-460;
Iwasaki et al., 1993, Isolation, identification, and biological activities of oxidative
metabolites of FK-506, a potent immunosuppressive macrolide lactone, Drug Metabolism
& Disposition 21: 971-977; Shiraga et al., 1994, Metabolism of FK-506, a potent
immunosuppressive agent, by cytochrome P450 3A enzymes in rat, dog, and human liver
microsomes, Biochem. Pharmacol. 47: 727-735; and Iwasaki et al., 1995, Further
metabolism of FK-506 (Tacrolimus); Identification and biological activities of the
metabolites oxidized at multiple sites of FK-506, Drug Metabolism & Disposition 23:
28-34. The cytochrome P450 3A subfamily of isozymes has been implicated as important in
this degradative process.

Structures of the eight isolated metabolites formed by liver microsomes are shown
in Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on
carbons 13, 15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy)
compounds undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII,
and the 12-hydroxy metabolite at C-10 to give I. Another four metabolites formed by
oxidation of the four metabolites mentioned above were isolated using liver microsomes
from dexamethasone-treated rats. Three of these are metabolites doubly demethylated at
the methoxy groups on carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15
(M-VII). The fourth, M-VIII, was the metabolite produced after demethylation of the
31-methoxy group, followed by formation of a fused ring system by further oxidation.
Among the eight metabolites, M-II has immunosuppressive activity comparable to that of
FK-506, whereas the other metabolites exhibit weak or negligible activities. Importantly,
the major metabolite of human, dog, and rat liver microsomes is the 13-demethylated and
cyclized FK-506 (M-I).

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed
by cyclization to the inactive M-I, this representing about 90% of the metabolic products
after a 10 minute incubation with liver microsomes. Analogs of tacrolimus that do not
possess a C-13 methoxy group would not be susceptible to the first and most important
biotransformation in the destructive metabolism of tacrolimus (i.e., cyclization of
13-hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer
half-life in the body than does FK-506. The C-13 methoxy group is believed not to be
required for binding to FKBP or calcineurin. The C-13 methoxy is not present on the
identical position of rapamycin, which binds to FKBP with an affinity equal to that of
tacrolimus. Also, analysis of the 3-dimensional structure of the
FKBP-tacrolimus-calcineurin complex shows that the C-13 methoxy has no interaction with
FKBP and only a minor interaction with calcineurin. The present invention provides
C-13-desmethoxy analogs of FK-506 and FK-520, as well as the recombinant genes that
encode the PKS enzymes that catalyze their synthesis and host cells that produce the
compounds.

These compounds exhibit, relative to their naturally occurring counterparts,
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or
reduced frequency of administration. Dosing is more predictable, because the variability
in FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood
can vary widely depending on interactions with drugs that induce or inhibit cytochrome
P450 3A (summarized in USP Drug Information for the Health Care Professional). Of
particular importance are the numerous drugs that inhibit or compete for CYP 3A,
because they increase FK-506 blood levels and lead to toxicity (Prograf package insert,
Fujisawa US, Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A
(e.g., dexamethasone), because they decrease FK-506 blood levels and reduce efficacy.
Because the major site of CYP 3A action on FK-506 is removed in the analogs provided
by the present invention, those analogs are not as susceptible to drug interactions as the
naturally occurring compounds.
Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant adverse
effects resulting from the use of FK-506 and are believed to be similar for FK-520.
Because these effects appear to occur primarily by the same mechanism as the
immunosuppressive action (i.e., FKBP-calcineurin interaction), the intrinsic toxicity of the
desmethoxy analogs may be similar to that of FK-506. However, toxicity of FK-506 is dose
related and correlates with high blood levels of the drug (Prograf package insert,
Fujisawa US, Rev 4/97, Rec 6/97). Because the levels of the compounds provided by
the present invention should be more controllable, the incidence of toxicity should be
significantly decreased with the 13-desmethoxy analogs. Some reports show that certain
FK-506 metabolites are more toxic than FK-506 itself, and this provides an additional
reason to expect that a CYP 3A-resistant analog can have lower toxicity and a higher
therapeutic index.

Thus, the present invention provides novel compounds related in structure to
FK-506 and FK-520 but with improved properties. The invention also provides methods for
making these compounds by fermentation of recombinant host cells, as well as the
recombinant host cells, the recombinant vectors in those host cells, and the recombinant
proteins encoded by those vectors. The present invention also provides other valuable
materials useful in the construction of these recombinant vectors that have many other
important applications as well. In particular, the present invention provides the FK-520
PKS genes, as well as certain genes involved in the biosynthesis of FK-520 in
recombinant form.

FK-520 is produced at relatively low levels in the naturally occurring cells,
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus,
another benefit provided by the recombinant FK-520 PKS and related genes of the
present invention is the ability to produce FK-520 in greater quantities in the recombinant
host cells provided by the invention. The invention also provides methods for making
novel FK-520 analogs, in addition to the desmethoxy analogs described above, and
derivatives in recombinant host cells of any origin.

The biosynthesis of FK-520 involves the action of several enzymes. The FK-520
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products,
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9
mediated by the P450 hydroxylase that is the fkbD gene product; the resulting hydroxyl
group is oxidized by the fkbO gene product to form a keto group at C-9. There is also a
methylation at C-31 that is mediated by an O-methyltransferase that is the fkbM gene
product. There are also methylations at the C-13 and C-15 positions by a
methyltransferase believed to be encoded by the fkbG gene; this methyltransferase may
act on the hydroxymalonyl CoA substrates prior to binding of the substrate to the AT
domains of the PKS during polyketide synthesis. The present invention provides the
genes encoding these enzymes in recombinant form. The invention also provides the
genes encoding the enzymes involved in ethylmalonyl CoA and 2-hydroxymalonyl CoA
biosynthesis in recombinant form. Moreover, the invention provides Streptomyces
hygroscopicus var. ascomyceticus recombinant host cells lacking one or more of these
genes that are useful in the production of useful compounds.

The cells are useful in production in a variety of ways. First, certain cells make a
useful FK-520-related compound merely as a result of inactivation of one or more of the
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable
to make FK-520 or FK-520 related compounds due to an inactivation of one or more of
the PKS genes. These cells are useful in the production of other polyketides produced by
PKS enzymes that are encoded on recombinant expression vectors and introduced into
the host cell.

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or
an FK-520 derivative compound is restored by introduction of a recombinant expression
vector that contains the functional gene in a modified or unmodified form. The
introduced gene produces a gene product that, together with the other endogenous and
functional gene products, produces the desired compound. This methodology enables one
to produce FK-520 derivative compounds without requiring that all of the genes for the
PKS enzyme be present on one or more expression vectors. Additional applications and
benefits of such cells and methodology will be readily apparent to those of skill in the art
after consideration of how the recombinant genes were isolated and employed in the
construction of the compounds of the invention.

The FK-520 biosynthetic genes were isolated by the following procedure.
Genomic DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus
(ATCC 14891) using the lysozyme/proteinase K protocol described in Genetic
Manipulation of Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The
average size of the DNA was estimated to be between 80 and 120 kb by electrophoresis on
0.3% agarose gels. A library was constructed in the SuperCos™ vector according to the
manufacturer's instructions and with the reagents provided in the commercially available
kit (Stratagene). Briefly, 100 µg of genomic DNA was partially digested with 4 units of
Sau3A I for 20 min. in a reaction volume of 1 mL, and the fragments were
dephosphorylated and ligated to SuperCos vector arms. The ligated DNA was packaged
and used to infect log-stage XL1-Blue MR cells. A library of about 10,000 independent
cosmid clones was obtained.
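
To give a sense of how thoroughly a library of this size samples a genome, the following Python sketch applies the standard Clarke-Carbon estimate, P = 1 - (1 - f)^N, where f is the ratio of insert size to genome size and N is the number of independent clones. The text states only the clone count; the insert size (40 kb) and genome size (8 Mb) used below are assumptions made for illustration, as are the function names.

```python
import math

# Assumed values for illustration; neither figure is stated in the text.
ASSUMED_INSERT_BP = 40_000
ASSUMED_GENOME_BP = 8_000_000


def probability_covered(num_clones, insert_bp, genome_bp):
    """Clarke-Carbon probability that a given locus is present in the library."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** num_clones


def clones_needed(probability, insert_bp, genome_bp):
    """Smallest number of clones giving at least `probability` of including a locus."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


def fold_coverage(num_clones, insert_bp, genome_bp):
    """Average number of times each base is represented in the library."""
    return num_clones * insert_bp / genome_bp


if __name__ == "__main__":
    n = 10_000  # clone count reported in the text
    print("fold coverage:", fold_coverage(n, ASSUMED_INSERT_BP, ASSUMED_GENOME_BP))
    print("P(locus present):", probability_covered(n, ASSUMED_INSERT_BP, ASSUMED_GENOME_BP))
    print("clones for 99%:", clones_needed(0.99, ASSUMED_INSERT_BP, ASSUMED_GENOME_BP))
```

Under these assumed sizes, a library of about 10,000 clones oversamples the genome many times, which is consistent with overlapping cosmids for a single gene cluster being recoverable by hybridization screening.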

Based on recently published sequence from the FK-506 cluster (Motamedi and
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These
cosmids (pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that
overlap with one another. Initial sequence data from these two cosmids generated
sequences similar to sequences from the FK-506 and rapamycin clusters, indicating that
the inserts were from the FK-520 PKS gene cluster. Two EcoRI fragments were
subcloned from cosmids pKOS034-124 and pKOS034-120. These subclones were used
to prepare shotgun libraries by partial digestion with Sau3A I, gel purification of
fragments between 1.5 kb and 3 kb in size, and ligation into the pLitmus28 vector (New
England Biolabs). These libraries were sequenced using dye terminators on a Beckman
CEQ2000 capillary electrophoresis sequencer, according to the manufacturer's protocols.
To obtain cosmids containing sequence on the left and right sides of the
sequenced region described above, a new cosmid library of ATCC 14891 DNA was
prepared essentially as described above. This new library was screened with a new fkbM
probe isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the
end of cosmid pKOS034-124 was also used. Several additional cosmids to the right of the
previously sequenced region were identified. Cosmids pKOS065-C31 and pKOS065-C3
were identified and then mapped with restriction enzymes. Initial sequences from these
cosmids were consistent with the expected organization of the cluster in this region. More
extensive sequencing showed that both cosmids contained, in addition to the desired
sequences, other sequences not contiguous to the desired sequences on the host cell
chromosomal DNA. Probing of additional cosmid libraries identified two additional
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120,
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type
Culture Collection, Manassas, VA, USA. The complete nucleotide sequence of the
coding sequences of the genes that encode the proteins of the FK-520 PKS is shown
below but can also be determined from the cosmids of the invention deposited with the
ATCC using standard methodology.

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four
open reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame
encodes the loading module and the first four extender modules of the PKS. The fkbC
open reading frame encodes extender modules five and six of the PKS. The fkbA open
reading frame encodes extender modules seven, eight, nine, and ten of the PKS. The fkbP
open reading frame encodes the NRPS of the PKS. Each of these genes can be isolated
from the cosmids of the invention described above. The DNA sequences of these genes
are provided below preceded by the following table identifying the start and stop codons
of the open reading frames of each gene and the modules and domains contained therein.

Nucleotides                          Gene or Domain

complement (412 - 1836)              fkbW
complement (2020 - 3579)             fkbV
complement (3969 - 4496)             fkbR2
complement (4595 - 5488)             fkbR1
5601 - 6818                          fkbE
[illegible]                          fkbF
[illegible]                          fkbG
complement (9122 - 9883)             fkbH
complement (9894 - 10994)            fkbI
complement (10987 - 11247)           fkbJ
complement (11244 - 12092)           fkbK
complement (12113 - 13150)           fkbL
complement (13212 - 23988)           fkbC
complement ([illegible] - 46573)     fkbB
[illegible]                          fkbO
[illegible]                          fkbP
[illegible]                          fkbA
[illegible]                          fkbD
[illegible] - 73407                  fkbM
complement (73460 - 76202)           fkbN
complement (76336 - 77080)           fkbQ
complement (77076 - 77535)           fkbS
complement (44974 - 46573)           CoA ligase of loading domain
complement (43777 - 44629)           ER of loading domain
complement (43144 - 43660)           ACP of loading domain
complement (41842 - 43093)           KS of extender module 1 (KS1)
complement (40609 - 41842)           AT1
complement (39442 - 40609)           DH1
complement (38677 - 39307)           KR1
complement (38371 - 38581)           ACP1
complement (37145 - 38296)           KS2
complement (35749 - 37144)           AT2
complement (34606 - 35749)           DH2 (inactive)
complement (33823 - 34480)           KR2
complement (33505 - 33715)           ACP2
complement (32185 - 33439)           KS3
complement (31018 - 32185)           AT3
complement (29869 - 31018)           DH3 (inactive)
complement (29092 - 29740)           KR3
complement (28750 - 28960)           ACP3
complement (27430 - 28684)           KS4
complement ([illegible] - 27430)     AT4
complement (24997 - 26146)           DH4 (inactive)
complement (24163 - 24373)           ACP4
complement (22653 - 23892)           KS5
complement (21420 - 22653)           AT5
complement (20241 - 21420)           DH5
complement (19464 - 20097)           KR5
complement (19116 - 19326)           ACP5
complement ([illegible])             [illegible]
complement ([illegible])             AT6
complement ([illegible])             DH6
complement ([illegible])             [illegible]
complement ([illegible])             KR6
complement ([illegible])             ACP6
52362 - 53576                        KS7
53577 - 54716                        AT7
54717 - 55871                        DH7
56019 - 56819                        ER7
[illegible]                          KR7
[illegible]                          ACP7
57990 - [illegible]                  KS8
[illegible]                          AT8
60399 - 61417                        DH8 (inactive)
61548 - 67180                        KR8
62328 - 67537                        ACP8
62598 - 63854                        KS9
63855 - 65084                        AT9
65085 - 66254                        DH9
66399 - 67175                        ER9
67299 - 67931                        KR9
68094 - 68303                        ACP9
68397 - 69653                        KS10
69654 - 70985                        AT10
71064 - 71273                        ACP10

1 GATCTCAGGC ATGAAGTCCT CCAGGCGAGG 
61 TGTACGGACC ACTTCAGTCA GCGGCGATTG 
121 TTACAAGATC CTCACATTGC GCGACCGCCA 
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG 
241 ACCGTCACCT CTCTCCCCCG CCGGCGGGAT 
301 ACGCTGAACA CCCGCGCGGT GTGGCGTCGG 
361 TACGGGGAGG GCGTACGGCG GCCGTGGCTC 
4 21 GAGACGGCAC TCGGCGAGCA GGGACGCCTG 
4 81 GTTCGCGGGC GGGCGGTGGC CGGTGGTGAG 
541 GTGACACGGC AGCAAAGGCC GGAGTCGGTC 
601 CGTGCCGTCC TCGATGCGGT AGTAGCGGTA 
661 GCGTACACGT CGGAGCCCGG GCGGCAGGCA 
721 CAGCGGCTTG CCGATACGAC CGGTCAACGC 
781 GGAGCGGGTG GCGTAGTCGT AGTCGGCATC 
841 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA 
901 CTGCGTCAGA TCCCAGTAGA CCTCGTGGTG 
961 GAACCCGGCG CGGAGCAGCG CCTCGCGCGC 
1021 GGTGGGGTAG TCGCGCAGGG CGGCCGGCAG 
1081 CCACAGGGTG CCTTCCCAGT CGACTCCTCC 
1141 CCAGCGCACG AGGTAGCCGC CGTTGGACAT 
1201 GTGGTAGCGC TGGGCGACCG ACGCGCGGGC 



CGCCGAGGTG GTGAACACCT CGCCGCTGC7 
CGGAACCAAG TCATCCGGAA TAAAGGGCGG 
GCATACGCTG AGTTGCCTCA GAGGCAAACC 
GAGTACGCGA CGAGAGTGGC GCACCCGCGC 
GCCCGGCGTG ACACGGTTGG GCTCTCCTCG 
GGACACCGCC TGGCATCGGC CGGGTGACGG 
GTGCTCACGG CCGCCGGGCG GTCATCCGTC 
GTCGGCACCT GCGGGCCGGA CGACCGTGTG 
CCAGCTCTCC AGGGCGGTGA AGGCTGAGCG 
GGGGAAGGTG TCGACGAGGG CGTCGGTGTG 
CCGGCCGCCA GGCCGCTGCC GGACATACGC 
GCAGCACGTC GAGAGTGCCT GGATGGTGAT 
GATGCGTTCC ACGGCCGCGT GGACGCCGGA 
GCAGCCCGGG ACCGTCCCCG GGGCGCAATA 
GCCGGGGTCG AACTCCTCGC GGTAGACGCG 
GTACGGCCAC AAGAACTCGG AGTCGGCCGG 
CTGGCCGGCT GCGGGGCCGC CTGCCGCGTA 
GAAGGTGAAG AGGTTGGGAC CCTCCGCGCG 
GTCGTACAGC TCGGGATGGT TCTCCAGCTG 
CCCGGTGACC AGGGTGCGCT CGAGCGGCCG 
GGCCCGGGTC AGCTGGGTGA GGCGGGTGTT 
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12 61 CCACTCGGCG ACGGCGTCGC CCGGCCGGGA GCCATCACGG TAGAACGCGG GGCCGGTGTT 
1321 GCCCTTGTCG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG 
1381 GTCGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCCGGCC ACGACCAGGC CACCGTTCCA 
14 41 GCGGUCGGGC AGCCGGATGA CGAACTGGGC GTCGTGGTTC CACCCGTGGT TGGTGTTGGT 
5 1501 GGTGGAGGTG TCGGGGAAGT AGCCGTCGAT CTGGATCCCG GGCACTCCGG TGGGAGTGGC 
1561 CAGGTTCTTG GGCGTCAGCC CTGCCCAGTC CGCCGGGTCG GTGTGGCCGG TGGCCGCCGT 
1621 TCCCGCCGTG GTCAGCTCGT CCAGGCAGTC GGCCTGCTGA CGTGCCGCCG CCGGGACACG 
1681 CAGCTGGGAC AGACGGGCGC AGTGACCGTC CGGGGCATCG GGAGCAGGCC GGGCCGTGGC 

17 41 CGGTGAGGGG AGCAGGACGG CGACTGCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCGTCT 
10 18 01 TCTCGGGGCC CGTCCGACAC CGAGGGGCAG AACCATGGAG AGCCTCCAGA CGTGCGGATG 

18 61 GATGACGGAC TGGAGGCTAG GTCGCGCACG GTGGAGACGA ACATGGGTGC GCCCGCCATG 
1921 ACTGAGGCCC CTCAGAGGTG GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGCTCCGG 
1981 GGCGGTGCCC GCGGCCGCCA CCGGTTCCGG GTCCCCGGGT CAGGGACAGG TGTCGTTCGC 
2041 GACGGTGAAG TAGCCGGTCG GCGACTCTTT CAAGGTGGTC GTGACGAAGG TGTTGTACAG 

15 2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GGTGTAACCG GCGCTCGTCG TGGCGCGGCC 
2161 CGCCTGGACG TGAGCGTAGT TGCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG 
2221 CGCGGTGACC GCGCCCGAGA GCGGTCCGGC CTTGCCGTCC GCGTCCCGGG CGGCGACCGC 
2281 GTAGGTGTGC GATGTGCCCG CCCTCAGGCC GGTGTCCGTG TACGACGTCG TGGCGGACGT 
2341 GGTGATCTGG GCACCGTCGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTTTCCA 

20 24 01 GGTCAGGCTG ATGGTGGTGT CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG 
24 61 CGAACCGGGG TCGGAGGCGG ATCCGCTCAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA 
2521 AC AG AT CG AG TCCAGGAAGT AGGCGGCGCC GGTGCTGCCG CACTGCTGTG CTCCGGTGCC 
2581 GGGATCGACC GGGGTGCCGT GCCCGATGCC CGGCACCCGG TTCACCTCCA CGGCCACCGA 
2 641 TCCGTCCGCG GCCAGGTACT CCTCGTGCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG 

25 2701 GTCCGGCGTC TGGGACACGC CGTGCACAGC GGTCCACTGG TCGCGCAACT CGTCGGCGTT 

27 61 GCGCGGCGCG ACGGTGGTGT CCTTGTCGCC GTGCCAGATG GCCACGCGCG GCCACGGGCC 
2821 CGACCACGAG GGGTAGCCGT CACGGACCCG CCGCGCCCAC TGGTCCGCGG TCAGGTCGGT 

28 81 CCCGGGGTTC ATGCACAGGT ACGCGCTGCT GACGTCGGTG GCACAGCCGA AGGGCAGGCC 
2941 GGCGACGACC GCGCCGGCCT GGAAGACGTC CGGATAGGTG GCGAGCATCA CCGACGTCAT 

30 3001 GGCACCGCCG GCGGACAGCC CGGTGATGTA GGTGCGCTGG GGGTCCGCGC CGTAGGCGGA 
3061 GACGGTGTGA GCGGCCATCT GCCGGATCGA CGCGGCTTCG CCCTGGCCCC TGCGGTTGTC 
3121 GCTGCTCTGG AACCAGTTGA AGCACCTGTT CGCGTTGTTC GACGACGTGG TCTCGGCGAA 
3181 CACGAGCAGG AAGCCATAGC GGTCCGCGAA TGAGAGCAGG CCGGAGTTGT CGGCGTAGCC 
3241 CTGGGCGTCC TGGGTGCAAC CGTGCAGGGC GAACACCACC GCCGGCTCCG CGGGCAGGGA 

35 3301 CGCGGGCCGG TAGACGTACA TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACCTC 
3361 GGTCAGGTCC GCCTTGGTCA GACCGGGCTT GGCCAGGCCC GCCGCGGCGT GGGCCGTCGG 
3421 CGCCGGGCCG AGCAGGGCCG CTCCGAGTAC GAGGGCCACG ACGGCCACGA GACGGGTGAG 
34 81 CACCCCCCGC CGTCCCGGAC GCGACAACGA CCCGACCGGC GGCGAGGAGG AGAGGGGGAA 
3541 CAGCGGGGTG AGGATTCCCC GGAACGGCGG CGGCTGCATG GCGGCTCCCT CGATGTCGTG 

40 3 601 GGGGGGACAC GGAGGGCTCC CTGACGTCGA TCAGTGGGAG CGCCCCGGTG CCCGGCACCG 
36 61 TAGGGGTGGT TCAACCCGCA ACGGTATGGC CCGGAGCACC ACACCCCGCA CCGCGCGATG 
3721 TGCGCCCGGA CGGATTGTGT CGCCTTGCGG AATCTGATAC CCGGACGCGA CGAACGCCCC 
3781 ACCCGACACG GGTAGGGCGT CATGGTGTCC GACTCGGCCG GTCGGCCTTG CCTGCCCTGG 
3841 ACGGACCGGG CGTCGGCGGA CCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG 

45 3 901 CCAGCCGCGT GGGGCGGCCG CGCCCAAGTG CAGTACGCCG ACCGTGGCCG GCGGGAGGGC 
3961 CGGACCGGTC AGTGCAGTCC CGCGGCCCTG CGGGACCGCT CGTCCCAGAC GGGTTCCACC 
4 021 GCGGCGAACC GGGGTCCGTG TCCGCGGCGG TAGACCATCA GTGTCCGCTC GAAGGTGATG 
4 081 ACGATGACAC CGTCCTGGTT GTAGCCGATG GTGCGCACGC TGATGATGCC TACGTCAGGT 
4141 CGGCTGGCGG ACTCCCGGGT GTTCAGGACC TCGGACTGCG AGTAGATGGT GTCGCCCTCG 

50 4 201 AAGACCGGGT TCGGCAGCC1 GACCCGGTCC CAGCCGAGGT TGGCCATCAC ATGCTGGGAG 
4261 ATGTCGGTGA CGCTCTGCCC GGTGACCAGG GCGAGGGTGA AGGTGGAGTC CACCAGCGGC 
4321 TTGCCCCAGG TGGTGCCCGC CGAGTAGTGG CGGTCGAAGT GCAGCGGCGC GGTGTTCTGC
4381 GTCAGGAGCG TGAGCCAGGA GTTGTCGGTC TCCAGGACCG TGCGGCCCAG GGGGTGGCGG
4441 TACACGTCGC CGGTGGTGAA GTCCTCGAAG TAGCGGCCCT GCCAGCCCTC GACCACAGCG
4501 GTGCGGGTGG CGTCCTGGTC CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTCCC
4561 CGGTCCGCTG TGAAATGCCG AACCTTCACC GGGCTCATAC GTGCGGCGCA TGAGCCCTGG
4621 ACCGTACGTA GTCGTAGAAC CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTGTGA
4681 CCACGCCGAC CGTGCGCCGC GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGGTCAC
4741 CGGGCCCGGA CGGGCTGCCG GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA 
4801 GGGCCCGCAG CGTGCTCAGC TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG
4861 CGGCGCACAG CCGGTCGGTG ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCCACGAACG
4921 CCTCATCGGC CAGCTCCGCG GTCCGCACCC GGCGGCGTCT GGCCAGCCGG TGTCCGGGTG

4981 GGACGAGCAG GCACAGTGCC TCGTCCCGCA GTGGTGTCCA CTCCACATCG TCCCCGGCGG
5041 GTCGTGGGCT GGTCAGCCCC AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT 
5101 CGGCGGCGTC GCCGCGCAGT TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG TACCCGGCGA 
5161 GGAGGTCGGG CACCAGCCAG GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG 
5221 TGTCGGGGTC GATCAGGGCG GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC 
5281 GCAGGGCGTG GGCGCGGAAG ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC 
5341 GGTCGAACAG CGGCACGCCC ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG 
5401 GCTGGGAGAT GTTGAGCCGT TCCGCGGTGA TCGTCACGTG CTCGTGCTCG GCCAAGGCCG
5461 TGAACCACTG CAACTCCCGT ATCTCCATGC AGGGACTATA CGTACCGGGC ATGGTCCTGG
5521 CGAGGTTTCG TCATTTCACA GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG 
5581 GACCCCATGG GAGGGACCCC ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG 

5641 CCGGGCCCCT GTCCGGTCTG CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG
5701 CCACCCGCCA CCTGGCGGAC CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGCAGCG

5761 GCGACCTCGC CCGCGGCTAC GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC
5821 TGAACCGGGG GAAGGAGAGC GTCCAGCTCG ATGTGCGCTC GCCGGAGGGC AACCGGCACC 

5881 TGCACGCCTT GGTGGACCGG GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG
5941 GCCGCCTGGC ATCGGCCACC AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA
6001 CATATCCGGC TACGGCAGTA CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCCTGG 
6061 TCCAGTGCGA AGCGGGGCTG GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGGTGG
6121 GCCTGTCCAT CGCGGACATC TGTGCGGGGA TGTACGCGTA CTCCGGCATC CTCACGGCCC 
6181 TGCTGAAGCG GGCCCGCACC GGCCGGGGCT CGCAGTTGGA GGTCTCGATG CTCGAAGCCC 
6241 TCGGTGAATG GATGGGATAC GCCGAGTACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC 
6301 GCGCCGGCGC CAGCCACGCG ACGATCGCCC CCTACGGCCC GTTCACCACG CGCGACGGGC 
6361 AGACGATCAA TCTCGGGCTC CAGAACGAGC GGGAGTGGGC TTCCTTCTGC GGTGTCGTGC 
6421 TACAACGCCC CGGTCTCTGC GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC 
6481 ACCGCACCGA GCTCGACGCC CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC
6541 TGGTGGCGCG GCTGGAGGAG GCGTCGATCG CCTACGCACG CCAGCGCACC GTGCGGGAGT 
6601 TCAGCGAACA CCCCCAACTG CGTGACCGTG GACGCTGGGC TCCGTTCGAC AGCCCGGTCG 
6661 GTGCGCTGGA GGGCCTGATC CCCCCGGTCA CCTTCCACGG CGAGCACCCG CGGCGGCTGG 
6721 GCCGGGTCCC GGAGCTGGGC GAGCATACCG AGTCCGTCCT GGCGTGGCTG GCCGCGCCCC 
6781 ACAGCGCCGA CCGCGAAGAG GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCCTG
6841 GCCGCCGTGT TCCTGCTCGC CGGCGTACGG GGGCTGAACA TGGGCCTGCT CGCGCTGGTC 
6901 GCCACCTTTC TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGGT 
6961 TTCCCCGCGA GCATGTTCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC 
7021 GTCAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG 
7081 GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TCTGCGCGAC AGGCGCGGCC
7141 TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCGTCG CGTTCGCCGT CAGGCACCGC 
7201 ATCGATCCGC TGTACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CAGTTTCGCC 
7261 CCCTCCGGGA TCCTGGGCGG CATCGTCCAC TCGGCGCTGG AGAAGAACCA TCTGCCCGTC
7321 AGCGGCGGGC TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTGTCA 
7381 TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC TGGACGAGGA CACCGATCCC 
7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACGCT GACCGCGATG
7501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCACC 
7561 TTGGCGGCGT TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGATC 
7621 GCCTGGCCCG TGGTGCTGCT GGTATGCGGG ATCGTGACCT ACGTCGCCCT GCTCCAGGAG
7681 CTGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCCC GCTGCTGGCC
7741 GCCCTGGTGA TCTGCTACGT GGGCGGTGTC GTCTCGGCCT TCGCCTCGAC CACCGGGATC
7801 CTCGGTGCCC TGATGCCGCT GTCCGAGCCG TTCCTGAAGT CCGGTGCCAT CGGGACGACC 
7861 GGCATGGTGA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAGTCC CTTCTCCACC
7921 AATGGTGCTC TGGTGGTGGC CAACGCTCCC GAGCGGCTGC GGCCCGGCGT GTACCAGGGG

7981 TTGCTGTGGT GGGGCGCCGG GGTGTGCGCA CTGGCTCCCG CGGCCGCCTG GGCGGCCTTC
8041 GTGGTGGCGT GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGTCG 
8101 CTGACGTAGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCATGGC 
8161 TAATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTACT 
8281 GCCGGTGCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG 
8341 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT 
8401 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCGA
8461 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGCCCG
8521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT 
8581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT 
8641 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA 
8701 AGCGGTGCAG GACCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA
8761 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG
8821 GTGACCGGGG CGATGTCGGC GGCGGTCAGC GTCAGCGTCG ^CGGCGCGGG CCTCGCGGRG 
8881 GGCTCCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG 

8941 GGGCAGTCGG AGTCCGCGAA GCCCGCGAAC CGGTAGGCGA TCTCCATCAT GCGGTTGCGG
9001 TCCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCCTG GTCCGTGAGC 
9061 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAGT 
9121 TTCAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCGTGCGGG
9181 CCGAAGCGGT CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC 
9241 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA 
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA 

9361 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA
9421 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA 

9481 ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA
9541 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG 
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA 
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGCTCC CACGCGAGGT 
9721 CGTGGTCGTT CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGGTGATCA 
9781 CCTCGCGGAT CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA 
9841 AGGTGTTGTC CAGGTCCCAG ACCAGACACT TGACAATGGT CATGGCTGTC CTCTCAAGCC
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCCTCGATG 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCCTC TCTCGCGCCT 

10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCGTTC GGCGGCGACG 
10081 TGCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGGTCGCTG 
10141 GCGTACTCGC ACACGCGGGC CGCGATCTGC TCCGCGGTCC ACAGGTCGGC GATGTGCCCG 
10201 GCGACGAGTT GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCGTGGGCC 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGATC CCGACGCAGC CCCAGGCGAC CGACTTGCGC 
10321 CCGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 
10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG 
10441 CCGGACGGCT TCGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC
10501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCGTCC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAGT 
10681 TTCCCGCTGG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCGTCGCC GAGCCGCTGC 
10741 ACGGTCCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG 
10801 CCGACGTGTG CGGTGAACTC GCCGTTCTCC CGGCTGCCGA GTCCCAGACC GCCGTGCTCG 
10861 GCCGCCACTT CCGCGCAGAG CAGGCCGTCG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC
10921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGGTCGGT CAGCAGCGCG 
10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCATGG ACTCGACGGT
11041 ACGGAAGTTC GCGAGCTGGA GGTCCGGGCC GGCGATCGTG ACGTCGAACG TCTTCTCCAG
11101 GTACACGACC AGTTCCATCG CGAACAGCGA CGTGAGGCCG CCCTCCGCGA ACAGGTCGCG
11161 GTCCACGGGC CAGTCCGACC TGGTCTTCGT CTTGAGGAAC GCGACCAACG CGTGCGCGAC
11221 GGGGTCGTCC TTGACGGGTG CGGTCATGAG AACACCTTCT CGTATTCGTA GAAGCCCCGG
11281 CCGGTCTTCC GGCCGTGGTG TCCCTCGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG
11341 CTGCGCTCGT CGCCGGTGCG TTTGTGCAGC ACCCACAGCG CGTCGACGAG GTTGTCGATG
11401 CCGATCAGGT CCGCGGTGCG CAGCGGCCCG GTCGGATGGC CGAGGCACCC CGTCATGAGC
11461 GCGTCGACGT CCTCGACGGA CGCGGTGCCC TCCTGCACGA TCCGCGCCGC GTCGTTGATC
11521 ATCGGGTGGA GCAGCCGGCT CGTGACGAAG CCGGGCGCGT CCCGGACGAC GATCGGCTTG
11581 CGCCGCAGCG CCGCGAGCAG GTCCCCGGCG GCGGCCATGG CCTTCTCACC GGTCCGGGGT
11641 CCGCGGATCA CCTCGACCGT CGGGATCAGG TACGACGGGT TCATGAAGTG CGTGCCGAGC
11701 AGGTCCTCGG GCCGGGCCAC GGAGTCGGCC AGTTCGTCAA CCGGGATCGA CGACGTGTTC
11761 GTGATGACCG GGATACCGGG CGCCGCTGCC GAGACCGTGG CGAGTACCTC CGCCTTGACC
11821 TCGGCGTCCT CGACGACGGC CTCGATCACC GCGGTGGCCG TACCGATCGC GGGCAGCGCG
11881 GACGTGGCCG TCCGCAGCAC ACCGGGG^CG GCCTCGGCGG GCCCGGCCAC GAGTTGTGCC
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG
12121 GCAGCGAGTA CGGGTCGAGG ACGTCTTCCG GGGTCGACCC GATCGCGTCC TTGCGGCCGA
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC
12241 TGCCCGTCGA GTCGAGGACG CTCAGGCTGT CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG
12301 CGCACAGGGC CGCCAGCGAC GGGCCGAGCT CGCGGTCCGG CAGTTGCTGG TACTCGCCCT
12361 CGGCGCGGGC CTGCCCCGGA TGGTCGACGC AGATGAACGC GTCGTCGAGC AGGGTCTTCG
12421 GCAGTTCGGT CTTGCCCGGC TCGTCGGCGC CGATGGCGTT CACATGCAGG TGCGGCAGCC
12481 GCGGCTCGGC GGGCAGCACC GGCCCTTTGC CCGAGGGCAC CGAGGTGACG GTGGACAGGA
12541 CATCCGCGGC GGCGGCGGCC TCCGCCGGAT CGGTCACCTT GACCGGCAGT CCGAGGAACG
12601 CGATGCGGTC CGCGAACGAC GCCGCGTGGC CGGGGTCGGT GTCGCTGACC AGGATCCGCT
12661 CGATGGGCAG GACCCTGCTG AGCGCGTGCG CCTGGGTCAC CGCCTGTGCG CCCGCGCCGA
12721 TCAGCGTGAG CGTGGCGCTG TCGGACCGGG CCAGCAGCCG GCTCGCGACG GCGGCGACCG
12781 CGCCGGTCCG CATCGCGGTG ATCACGCCTG CGTCGGCGAG GGCGGTCAGA CTGCCGCTGT
12841 CGTCGTCGAG GCGCGACATC GTGCCGACGA TCGTCGGCAG CCGGAAGCGC GGATAGTTGT
12901 GCGGACTGTA CGAAACCGTC TTCATGGTCA CGCCGACACC GGGGACCCGG TACGGCATGA
12961 ACTCGATGAC GCCGGGAATG TCGCCGCCGC GGACGAATCC GGTACGCGGC GGCGCCTCGG
13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CGTCGTGCAG CTCGCTGATC AGCCGGTCCA
13081 TCATCACGTC GCGGCCGATC ACGGAGAGAA TCCGCTTGAT GTCACGTTGG CGCAGGACCC
13141 TGGTCTGCAT GTGTCACC7C CCTTTCGTGG CCGGAGCTGT CTTGGTGGTG CCGCTCGGGG
13201 CGGCTTCCGT TCTCATCGCA GCTCCCTGTC GATGAGGTCG AAAATCTCGT CCGCGGTCGC
13261 GTCCGCGGAC AGCACGCCGG CCGGCGTGGT CGGGCGGGTC TCCCGCCGCC AGCGGTTGAG
13321 CAGGGCGTCC AGCCGGGTTC CGATCGCGTC CGCCTGGCGG GCGCCCGGGT CGACACCGGC
13381 AACGAGTGCT TCCAGCCGGT CGAGCTGCGC GAGCACCACG GTCACCGGGT CGTCCGGGGA
13441 CAGCAGTTCA CCGATGCGGT CGGCGAGTGC GCGCGGCGAC GGGTAGTCGA AGACGAGCGT
13501 GGCGGACAGT CGCAGACCGG TCGCCTCGTT GAGGCCGTTG CGCAGCTGCA CCGCGATGAG
13561 CGAGTCCACA CCGAGTTCCC GGAACGCCGC GTCCTCCGGG ATGTCCTCCG GGTCGGCGTG
13621 GCCCAGGACG GCCGCTGCC7 TCTGCCGGAC GAGGGCGAGC AGGTCGGTGG GGCGTTCCTG
13681 CTCGTTGCGG GCGCTCCGGC GGGCCGACGG CTTGGGCCGG CCACGCAGCA GCGGGAGGTC
13741 CGGCGGCAGG TCGCCCGCCA CGGCGACGAC ACTGCCCGTT CCGGTGTGGA CGGCGGCGTC
13801 GTACATGCGC ATGCCCTGTT CGGCGGTGAG CGCGCTCGCC CCACCCTTGC GCATACGGCG
13861 CCGGTCGGCG TCGGTCAGGT CCGCGGTCAG GCCACTCGCC TGGTCCCACA GCCCCCACGC
13921 GATCGACAGC CCTGGCAGCC CTTGTGCACG CCGGTGTTCG GCGAGCGCGT CGAGGAACGC
13981 GTTCGCCGCC GCGTAGTTGC CCTGACCGGG GGTGCCCAGC ACACCGGCCG CCGACGAGTA
14041 GACGACGAAT GCGGCGAGGT CGGTGTCGCG GGTGAGCCGG TGCAGGTGCC AGGCGGCGTC
14101 GGCCTTGGGT TTGAGGACGG TGTCGATGCG GTCGGGGGTG AGGTTGTCGA GCAGGGCGTC
14161 GTCGAGGGTT CCGGCGGTGT GGAAGACGGC GGTGAGGGGT TGAGGGATGT GGGCGAGGGT






14221 GGTGGCGAGT TGGTGGGGGT CGCCGACGTC GCAGGGGAGG TGGGTGCCGG GGGTGGTGTC 
14281 GGGGGGTGGG GTGCGGGAGA GGAGGTAGGT GTGGGGGTGG TTCAGGTGGC GGGCGAGGAT 
14341 GCCGGCGAGG GTGCCGGAGC CGCCGGTGAT GACGACGGCC CCCTCGGGGT CCAGCGGCCG
14401 CGGGACCGTG AGGACGATCT TGCCGGTGTG CTCGCCGCGG CTCATGGTCG CCAGCGCCTC
14461 GCGGACCTGC CGCATGTCGT GCACCGTCAC CGGCAGCGGG TGCAGCACAC CGCGCGCGAA
14521 CAGGCCGAGC AGCTCCGCGA TGATCTCCTT GAGCCGGTCG GGCCCCGCGT CCATCAGGTC
14581 GAACGGTCGC TGGACGGCGT GCCGGATGTC CGTCTTCCCC ATCTCGATGA ACCGGCCACC
14641 CGGCGCGAGC AGGCCGACGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT TGAGCACGAC
14701 GTCGACCGGC GGGAACGCGT CGGCGAACGC GGTGCTGCGG GAATCGGCCA GATGCGCTCC

14761 GTCCAGGTCC ACCAGATGGC GCTTCGCGGC GCTGGTGGTC GCGTACACCT CCGCGCCCAG
14821 GTGCCGCGCG ATCTGCCGGG CGGCGGAACC GACACCGCCG GTGGCCGCGT GGATCAGGAC

14881 CTTCTCGCCG GGGCGCAGCC CGGCGAGGTC GACCAGGCCG TACCACGCGG TCGCGAACGC

14941 GGTCATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC GGCATCCGGC CGAGCATCCG
15001 GTGGTCGGCG ATGACCGTGG GGCCGAAGCC GGTGCCGACG AGGCCGAAGA CGCGGTCGCC 
15061 CGGTGCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG ATGCCCGCGG CCTCGCCGCC
15121 GAGCACGCCC TGACCGGGGT AGGTGCCGAG CGCGATCAGC ACATCGCGGA AGTTGAGGCC 
15181 CGCCGCACGC ACACCGATCC GGACCTCGGC CGGGGCGAGG GGGCGCCGGG GCTCCGCCGA 
15241 GTCGGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC GGCCGGATCA GCCACGTGTC 
15301 GCTGTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG CGGGCCGCCT CGAACCGGCC 
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGACGGCG ATGCGCTGCT GCTCGGGGGC 
15421 GAGCGTGACG CCGGACTCGG TCTCGACGTG GACGAACCGG CCGGGCTGCT CGGCCTGGGC
15481 GGCGCGCAGC AGTCCGGCCG CCGCGCCGGT GGCGAGGCCC GCGGTGGTGT GCACGAGCAG 
15541 ATCCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG GTGAGCGCAC GCGTCTCGGC 

15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG TCCACGTCGG TCGCGGGGAC
15661 ATCCGTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC AGGCCGGTGC CGACGGGTGG
15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG AGTTGGCCGG CGGAGTCGGC
15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG GCTCGGAGCA TGGCCGAGCC 
15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA CCCGCAGCGC TGTCGTCCGG 
15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC GCCGGATGCA CACCGAAACC 
15961 GTCCGCCTCG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG GCATACACGG TGTCACCATC 
16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC TCATAACCGG CATCCCGCAG 
16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG ACCGGCGGCC ACTGCGAGAA 
16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG GGGGTCAGGG TGCCGCTGGC 
16201 GTGCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG ACGGTCACCG GCCGCCGTCC 
16261 GGCCTCATCA GCCCCTTCCA CGGTCACCGA CACATCCACC GCTGCGGTCA CCGGCACCAC 
16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA CCGGTCTCGT CACCGGCCCG 
16381 GATGACCAGC TCCACAAACG CCGTACCCGG CAGCAGGACC GTGCCCCGCA CCGCGTGATC 
16441 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG AGAACAACAC CACCATCGTC 
16501 GGCGGGCAGC GCTGTGACAG CGGCCAGCAT CGGATGCGCC GCACCCGTCA ACCCCGCCGC 
16561 CGACAGATCG GTGGCACCGG CCGCCTCCAG CCAGTACCGC CTGTGCTCGA ACGCGTACGT 
16621 GGGCAGATCC AGCAGCCGTC CCGGCACCGG TTCGACCACC GTGTCCCAGT CCACTGCCGT 
16681 GCCCAGGGTC CACGCCTGCG CCAACGCCGT CAGCCACCGC TCCCAGCCGC CGTCACCGGT 
16741 CCGCAACGAC GCCACCGTGT GAGCCTGCTC CATCGCCGGC AGCAGCACCG GATGGGCACT 
16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC GCGTCCAACG CCACCGGACG 
16861 ACGCAGATTC CGGTACCAGT ACCCCTCATC CACCGGCTCC GTCACCCAGG CGCTGTCCAC 
16921 GGTCGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC CCCTCCAGTA CCTTGGCCAG 
16981 TTCATCCTCG ATGGCTTCCA CGTGGGGCGT GTGGGAGGCG TAGTCGACCG CGATACGACG 
17041 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC TCCACCGCCG ACGGGTCCCC 
17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC CACACACCCT CGACCAGACC 
17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC CGCCCGGCCA GTCGCGCCGC 
17221 GATGACCTGA CTGCGCAATG CCACCACGCG GGCGGCGTCC TCGAGGCTGA GGGCTCCGGC 
17281 CACGCACGCC GCCGCGATCT CGCCCTGGGA GTGTCCGATC ACCGCGTCCG GCACGACCCC 
17341 ATGCGCCTGC CACAGCGCGG CCAGGCTCAC CGCGACCGCC CAGCTGGCCG GCTGGACCAC
17401 CTCCACCCGC TCCGCCACAT CCGGCCGCGC CAACATCTCC CGCACATCCC AGCCCGTGTG
17461 CGGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG AACACCGCGG AGTGGGCCAT
17521 GAGTTCCACG CCCATGCCGA CCCACTGGGC GCCCTGGCCG GGGAAGACGA ACACCGTACG 
17581 CGGCTGGTCC ACCGCCACAC CCGTCACCCG GGCATCGCCC AGCAGCACCG CACGGTGACC 

17641 GAAGACAGCA CGCTCCCGCA CCAACCCCTG CGCGACCGCG GCCACATCCA CACCACCCCC
17701 GCGCAGATAC CCCTCCAGCC GCTCCACCTG CCCCCGCAGA CTCACCTCAC CACGAGCCGA 
17761 CACCGGCAAC GGCACCAACC CGTCAACAAC CGACTCCCCA CGCGACGGCC CAGGAACACC
17821 CTCAAGGATC ACGTGCGCGT TCGTACCGCT CACCCCGAAC GACGACACAC CCGCATGCGG 
17881 TGCCCGATCC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC AGCTCCACCG CACCGGCCGA 
17941 CCAGTCCACA TGCGACGACG GCTCGTCCAC ATGCAGCGTC TTCGGCGCGA TCCCGTACCG 
18001 CATCGCCATG ACCATCTTGA TCACACCGGC GACACCCGCC GCCGCCTGCG CATGACCGAT 
18061 GTTCGACTTC AACGAACCCA GCAGCAGCGG AACCTCACGC TCCTGCCCGT ACGTCGCCAG 
18121 AATGGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC GTCCCGTGCG CCTCCACCAC 
18181 GTCCACATCG GCGGCGCGCA GTCCGGCGTT CACCAACGCC TGCTGGATGA CACGCTGCTG 
18241 GGACGGGCCG TTGGGGGCGG ACAGCCCGTT GGAGGCACCG TCCTGGTTCA CCGCCGACCC 
18301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG TCGGAGAGCC GCTCCAGCAC 
18361 AAGAACGCCG GCGCCCTCCG CCCAGCCGGT GCCGTTGGCG GCGTCCGCGA ACGCGCGGCA 
18421 GCGGCCGTCG GGGGAGAGTC CGCCCTGCTG CTGGAATTCC ACGAACCCGG TCGGGGTCGC
18481 CATGACGGTG ACACCGCCGA CCAGCGCCAG CGAGCACTCC CCGTGGCGCA GTGCGTGCCC
18541 GGCCTGGTGC AGCGCGACCA GCGACGACGA GCACGCCGTG TCCACCGTGA ACGCCGGTCC 
18601 CTGGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACG CTGGGCTGCA TGCCGATCGA 

18661 GCCGAACCCG TCCAGGTCCG CGCCGACGCC GTACCCGTAC GAGAAGGCGC CCATGAACAC
18721 GCCGGTGTCG CTGCCGCGCA GTGTGCCCGG CACGATGCCC GCGCTCTCGA ACGCCTCCCA 
18781 TGTCGTTTCC AGCAGGATCC GCTGCTGGGG GTCCATGGCC CGTGCCTCAC GGGGGCTGAT
18841 GCCGAAGAAC GCGGCATCGA AGCCGGCGGC GTCGGAGAGG AAGCCGCCGC GGTCCGTGTC 
18901 CGATCCGCCG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG GCCGGGAAGC CGGTGACCGC 
18961 GTCGCCGCCA CTGTCCACCA TGCGCCACAG GTCGTCGGGC GAGGTGACGC CGCCCGGCAG
19021 TCGGCAGGCC ATGCCCACGA TGGCCAGCGG TTCGTCACGG GTCGCGGCGG CTGTGGGAAC 
19081 AGCGACCGGT GCGGCACCAC CGACCAGAGC CTCGTCCAAC CGCGACGCGA TGGCCCGCGG 
19141 CGTCGGGTAG TCGAAGACAA GCGTGGCGGG CAGTCGGACA CCGGTCGCCG CGGCGAGTCG 
19201 GTTCCGCAGT TCGACGGCGG TCAGCGAGTC GATACCCAGT TCCTTGAAGG CCGCGTCCGC 
19261 GGACACGTCC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC GCGTTGTCGC GGACCAGTGC 
19321 CAGCAGCGCG GTGTCCCGCT CAGCGCCGGA CATGGTGCCG AGCCGGTCGG CGAGCGGAAC 
19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA TCGGCGAAAA GCGGCGATGT 
19441 GTGCGCGGTG AGGTCCATCG TGGCCGCCAC GGCGAACGCG GTGCCGGTTC CGGCCGCGGC 
19501 TTCCAGCAGG CGCATGCCCA CACCGGCCGA CATGGGGCGG AAACCGCCGC GGCGGACACG 
19561 GGTGCGGTTG GTGCCGCTCA TGCTGCCGGT GAGTCCGCTG TCATCGGCCC AGAGGCCCCA 
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC GTCGCGAGTC CGTCGAGGAA 
19681 CCCGTTCGCC GCCGAGTAGT TGCCCTGGCC GCGGCCGCCC ATGATGCCCG CGACGGACGA 
19741 GTAGAGGACG AACGAGCGCA GGTCCGCGTC CCGGGTCAGC TCGTGCAGGT GCCAGGCGCC 
19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG GTGAGTGCCG TGGTCACGCC 
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC GGCCTGCCGG CGGCGGCGAG
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC GTCACAGCGG ATGTGGACAC CGGGAGTGTC 
19981 CGCCGGCGGT TCGCTGCGCG ACAGCAACAG GAGGTGGCGG GCGCCATGCT CGGCGACGAG 
20041 ATGCCGGGCG AGGAGACCTG CCAGCACACC CGAGCCGCCG GTGATGACCA CCGTGCCGTC 
20101 CGGGTCGAGC AGCGGTTCGG GCGTTTCCGC GGCGGCCGTG CGGGTGAACC GCGGCGCTTC 
20161 GTACCGGCCG TCGGTGACGC GGACGTACGG CTCGGCCAGT GTCGTGGCGG CGGCCAGCGC 
20221 CTCGATGGGG GTGTCGGTGC CGGTCTCCAC CAGCACGAAC CGGCCCGGGT GCTCGGCCTG 
20281 GGCGGACCGG ACGAGGCCGG CGACCGCTCC TCCGACCGGT CCCGCGTCGA TCCGGACGAC 
20341 GAGGGTGGTC TCCGCAGGGC CGTCCTCGGC GATCACCCGG TGCAGCTCGC CGAGCACGAA 
20401 CTCGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC GGTTCCGGGA GCGCGGAGAC 
20461 GATGTGGACC GCGTCCGCAG GACCGGGCCC GGGAGTGGGC AGCTCGGTCC AGGAGAGGCC
20521 GTACAAGGAG TTCCGTACGA CGGCGGCGTC GCCGTCGACG TTCACCGGTC GCGCGGTCAG 
20581 CGCGGCGACG GTCACCACCG GTTGGCCGAC CGGGTCCGTC GCATGCACGG CAGCGCCGTC 
20641 CGGGCCCTGA GTGATCGTGA CGCGCAGCGT GGTGGCCCCG GTCGTGTGGA ACCGCACGCC 
20701 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC GCGAGCAGCG GCAGGCAGGT
20761 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG GTGGACGCCA TAGTGCGGCG TGTCGTCCGC 
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG CCGTCGCGCC AGGCGGTGCG 
20881 CAGTCCCTGG AACGCTGGGC CGTAGCTGTA GCCGGTCTCG GCCAGCCGCT CGTAGAACGC 
20941 GCTCACGTCG ACGCGTCGCG CGCCCGGCGG CGGCCACGCG GGCGGCGGGA CCGCCGCGAC 
21001 GCTTCCGGCC CGGCCGAGGG TGCCGCTGGC GTGCCGGGTC CAGCTGTCCG TGCCCTCGGT 
21061 ACGCGCGTGG ACGGTCACTC GCCGCCGTCC GGCCTCATCG GCCCCTTCGA CGGTCACCGA 
21121 CACATCCACC GCGCCGGTCA CCGGCACCAC GAGCGGGGTC TCGATGACCA GTTCATCCAC 
21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GATGACCAGC TCCACAAACG CCGTACCCGG 
21241 CAGCAGAACC GTGCCCCGCA CCGCGTGATC AGCCAGCCAG GGATGCGTAC GCAACGAGAT 
21301 CCGGCCAGTG AGAACAACAC CACCACCGTC GTCGGCGGGC AGTGCTGTGA CGGCGGCCAG 

21361 CATCGGATGC GCCGCCCCGG TCAGCCCGGC CGCGGACAGA TCGGTGGCAC CGGCCGCCTC
21421 CAGCCAGTAC CGCCTGTGCT CGAACGCGTA GGTGGGCAGA TCGAGCAGCC GTCCCGGCAC 

21481 CGGTTCGACC ACCGTGTCCC AGTCCACTGC CGTGCCCAGG GTCCACGCCT GCGCCAACGC
21541 CGTCAGCCAC CGCTCCCAGC CGCCGTCACC GGTCCGCAAC GACGCCACCG TGTGAGCCTG 
21601 TTCCATCGCC GGCAGCAGCA CCGGATGGGC GCTGCACTCC ACGAACACGG ACCCGTCCAG 
21661 CTCCGCCACC GCCGCGTCCA GCGCGACGGG GCGACGCAGG TTCCGGTACC AGTAGCCCTC 
21721 ATCCACCGGC TCGGTCACCC AGGCGCTGTC CACCGTGGAC CACCAGGCCA CCGACCCGGT 
21781 CCCGCCGGAA ATCCCCTCCA GTACCTCGGC CAACTCGTCC TCGATGGCTT CCACGTGGGG 
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC ACGCCTTCGG CCTCGTACCG
21901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC ACAGTCGAAG ACGGGCCGTT 
21961 ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA CCGGCCGGCA ACGCCACCGA 
22021 AGCCATCGCC CCCCGCCCGG CCAGCCGCCC GGCGATCACC TGGCTGCGCA AGGCCACCAC 
22081 GCGGGCGGCG TCCTCAAGGC TGAGGGCTCC GGCCACACAC GCCGCCGCGA TCTCGCCCTG 
22141 GGAGTGTCCG ACCACCGCGT CCGGCACGAC CCCATGCGCC TGCCACAGCG CGGCCAGGCT 
22201 CACCGCGACC GCCCAGCTGG CCGGCTGGAC CACCTCCACC CGCTCCGCCA CATCCGGCCG 
22261 CGCCAACATC TCCCGCACAT CCCAGCCCGT GTGCGGCAAC AACGCCCGCG CACACTCCTC
22321 CATACGAGCC GCGAACACCG CAGAACACGC CATCAACTCC ACACCCATGC CCACCCACTG 
22381 AGCACCCTGC CCGGGAAAGA CGAACACCGT ACGCGGCTGA TCCACCGCCA CACCCATCAC 
22441 CCGGGCATCG CCCAACAACA CCGCACGGTG ACCGAAGACA GCACGCTCAC GCACCAACCC
22501 CTGCGCGACC GCGGCCACAT CCACACCACC CCCGCGCAGA TACCCCTCCA GCCGCTCCAC 
22561 CTGCCCCCGC AGACTCACCT CACTCCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC 
22621 AGCCGACTCC CCACGCGACG GCCCGGGAAC ACCCTCAAGG ATCACGTGCG CGTTCGTACC 
22681 GCTCACCCCG AAAGCGGAGA CACCGGCCCG GCGCGGACGT CCCGCGTCGG GCCACGCCCG
22741 CGCCTCGGTG AGCAGTTCCA CCGCGCCCTC GGTCCAGTCC ACATGCGACG ACGGCTCGTC
22801 CACATGCAGC GTCTTCGGCG CGATGCCATA CCGCATCGCC ATGACCATCT TGATGACACC 
22861 GGCGACACCC GCAGCCGCCT GCGCATGACC GATGTTCGAC TTCAACGAAC CCAGCAGCAG
22921 CGGAACCTCA CGCTCCTGCC CGTACGTCGC CAGAATCGCG TGCGCCTCGA TGGGATCGCC
22981 CAGCGTCGTC CCCGTCCCGT GCGCCTCCAC CACGTCCACG TCGGCGGGGG CGAGCCCCGC
23041 CTTGTGGAGG GCCTGGCGGA TGACGCGCTG CTGGGAGGGG CCGTTGGGTG CGGAGATGCC 
23101 GTTGGAGGCG CCGTCCTGGT TGACGGCGGA GGAGCGGACG ACCGCGAGGA CGGTGTGTCC 
23161 GTTGCGCTCG GCGTCGGAGA GCTTTTCGAC GACGAGGACG CCGGCCCCCT CGGCGAAACC 
23221 GGTGCCGTCC GCCGCGTCAG CGAACGCCTT GCACCGTCCG TCCGGCGCGA CGCCGCCCTG 
23281 CCGGGAGAAC TCCACGAAGG TCTGTGGTGA TGCCATCACT GTGACACCAC CGACCAGCGC 
23341 CAGCGAGCAC TCCCCGGTCC GCAGCGCCTG CCCGGCCTGG TGCAGCGCGA CCAGCGACGA 
23401 CGAACACGCC GTGTCGACCG TGACCGCCGG ACCCTCCATG CCGAAGAAGT ACGACAGCCG
23461 TCCGGCGAGC ACCGCGGGCT GTGTGCTGTA GGCGCCGAAT CCGCCCAGGT CCGCGCCCGT
23521 GCCGTAGCCG TAGTAGAAGC CGCCGACGAA GACGCCGGTG TCGCTGCCGC GCAGGGTGTC 
23581 CGGCACGATG CCGGCGTGTT CGAGCGCCTC CCAGGCGATT TCGAGGAGGA TCCGCTGCTG 
23641 CGGGTCGAGT GCGGTGGCCT CGCGCGGACT GATGCCGAAG AACGCGGCAT CGAAGTCGGC 
23701 GGCGCCCGCG AGTGCGCCGG CCCGCCCGGT GGCGGACTCG GCGGCGGCGT GCAGCGCGGC
23761 CACGTCCCAG CCGCGGTCGG TGGGGAAGTC GCCGATCGCG TCGCGGCCGT CCGCGACGAG
23821 CTGCCACAGC TCTTCCGGTG AGGTGACGCC GCCCGGCAGT CGGCAGGCCA TGCCGACGAC 
23881 GGCGAGCGGC TCGTTCGCCG CGGCGCGCAG CGCGGTGTTC TCCCGGCGGA GCTGCGCGTT 
23941 GTCCTTGACC GACGTCCGCA GCGCCTCGAT CAGGTCGTTC TCGGCCATCG CCTCATCCCT

24001 TCAGCACGTG CGCGATGAGC GCGTCTGCGT CCATGTCGTC GAACAGTTCG TCGTCCGGCT
24061 CCGCGGTCGT GGTGCTCGCG GGTGCCTGTG CCGGTGGTTC ACCGCCGTCC GGGGTCCCGT
24121 TGTCGTCCGG GGTCCCGTTG ACGTCCGGGG CCAGGAGGGT CAGCAGATGA CGGGTGAGCG

24181 CGCCGGCGGC GGGATAGTCG AAGACGAGCG TGGCCGGCAG CGGAATGCCG AGGGCCTCGG
24241 AGAGCCGGTT GCGCAGGCCG AGCGCGGTGA GCGAGTCGAC CCCGAGGTCC TTGAACGCCG 
24301 TGGTGGCCGT GACCGCCGCC GCGTCGGTGT GGCCCAGCAG GGTGGCGGCG GTGTCGCGGA
24361 CGACGCCGAG CAGCACCTGT TCCCGTTCCT TGTGGGGCAG GTCCGGCAGG CGTTCCAGCA 
24421 GGGAGCCGCC GTCGGTCGCG GAGCGCCGGG TGGGGCGCTG GATCGGTCGC CACAGCGGTG

24481 ACGGGTCGCC GGGCCCGGGT GGGGCGGTCG CCACGACCAC GGCTTCCCCG GTGGCGCACG
24541 CGGCGTCGAG GAGGTCGGTC AGCCGGTCCG CCGCGGCGGT GAACGCCACG GCCGGCAGGC
24601 CTTGTGCCCG GCGCAGGTCG GCCAGGGCCT GGAGCGGTCC GGCCGCCTCG CCGGACGGAA
24661 CGGCGAGAAC GAACGCGGTC AGGTCGAGGT CGCGGGTCAG GCGGTGCAGT TCCCAGGCCG
24721 ACTCGGCGGT GCCGTCCGCG TGGACGACCG CGGTCACCGG GGTTTCCGGC ACTGTGCCCG 

24781 GCTCGTACCG GATCACTTCG GCGCCGTGTC CGCCGAGGTG TCCGGCGAGT TCCTCCGAAC
24841 CGCCCGCGAG GAGGACGGTG TCGCCGTACG AGGCCGCGGC CGTGGTGGGC GCGGCGGGGA
24901 CGAGGCGGGG CGCTTCGAGG CGCCCGTCGG CCAGGCGCAG GTGCGGTTCG TCGAGGCGGG

24961 AGAGGGCGGC GGCGCGGCGG GGGGTGACCG TGTCGGTGGT CTCCACGAGC ACGAGCCGGC
25021 CCGGTTCCGC GGTGTCGAGC AGTGCGGCGA CGGCACCGGC GACGGGCCCG GCCTCGGCGG

25081 ACACCACCAG CGTGGCGCCG GCGGTCCTCG GGTCGTCCAG TGCGGTACGG ACCTCGTCGG
25141 GACCGGATAC CGGGACGACG ATGACGTCGG GCGTGGCGTC GTCGCCGAGG TCGGTGTACC 
25201 GGCGGGCCGT GGTGCCGGGT GCCGCCGGGG CCCGGACGCC GGTCCAGGTG CGCCGGAACA 
25261 GCCGCACGTC CCCGTCCGGG CCCGTCGTGG CGGGGGGCCG GGTGATGAGC GAGCCGATCT 
25321 GAGCCACCGG CCGTCCCAGT TCGTCGGCGA GGTGCACGCG GGCGCCGCCC TCGCCCTCGC 

25381 CGTGGACGAA GGTGACGCGC AGTTTCGTGG CGCCGCTGGT GTGGACACGG ACGCCGGTGA
25441 ACGCGAACGG CAACCGTACC CCCGCGTTCT CGGCGGCCGC GCCGATGCTG CCCGCTTGCA
25501 GCGCGGTGAC GAGCAGCGCC GGGTGCAGTG TGTAGCGGGC GGCGTCCCTG GCGAGGGCGC 
25561 CGTCGAGGGC GACTTCGGCG CAGACGGTGT CTCCGTGGCT CCACGCGGCG GACATGCCGC 
25621 GGAACTCGGG GCCGAACTCG TATCCCGCGT CGTCGAGTCG CTGGTAGAAG GCCGCGACGT 

25681 CGACCGGTTC CGCGTGCTCG GGCGGCCAGG GCCCCGGCGT GGTGGCCGGT TCGGTGGTGG
25741 CGATGCCGGC GAAGCCGGAG GCGTGGCGGG TCCATGTCCG GTCGCCGTCC GTCCGGGCGT 
25801 GGACGCGCAC GGCACGGCGT CCGGTGTCGT CGGGCGCGGC GACGGTCACG CGCACCTGGA 
25861 CGGCGCCGGT GGCGGGCAGG ACCAGCGGTG TCTCGACGAC CAGTTCGTCG AGCAGGTCGC

25921 AGCCTGCCTC GTCGGCGCCG CGTCCGGCCA ATTCCAGGAA GGCGGGUCCG GGCAGCAGTA
25981 CGGCGCCGTC GACGGAGTGA CCGGCCAGCC ATGGGTGGGT GGCCAGCGAG AACCGGCCGG

26041 TGAGCAGCAC CTCGTCGGAG TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGGTGGTCGA 
26101 CGGCGTCGAG TCCGAGGCCG GAAGCGTCCG TGCCGGCCGC GGTCTCGATC CAGTAGCGCT
26161 CATGGTGGAA GGCGTATGTG GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG ACGACCGCCG
26221 CCCAGTCGAC GGGCACGCCG GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC AGCCGGTGGA

26281 CTCCCCCGCC GCGGCGGAGC GTGGCGACGG TCGCGCCGTC GATCGCGGGC AGCAGCACGG
26341 GGTGCGCGCT GACCTCGACG AACACGGTGT CACCCGGCTC GCGGGCAGCG GTCACGGCCG
26401 TGGCGAAGCC TACGGGGTGG CGCATGTTGC GGAACCAGTA CTCGTCGTCG AGCGGCGCGT
26461 CGATCCAGCG TTCGTCGGCG GTGGAGAACC ACGGGATCTC GGGCGTGCGC GAGGTGGTGT
26521 CCGCGACGAT CCGCTGGAGT TCGTCGTACA GCGGGTCGAC GAACGGGGTG "IGGGTCGGGC

26581 AGTCGACGGC GATGCGGCGC ACCCAGACGC CGCGGGCCTC GTAGTCGGCG ATCAGCGTTT
26641 CGACGGCGTC CGGGCGCCCG GCGACGGTCG TGGTGGTGGC GCCGTTGCGG CCCGCGACCC
26701 AGACGCCGTC GATCCGGGCG GCATCCGCCT CGACGTCGGC GGCCGGGAGC GCGACCGAGC
26761 CCATCGCGCC GCG'ICCGGCG AGTTCGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA
26821 GGCGGGCACC GTCCTCCAGG GTGAGCGCTC CGGCGACACA GGCCGCGGCG ATCTCGCCCT

26881 GGGAGTGTCC GATGACGGCG TCCGGGCGTA CGCCCGCGGC CTCCCACACG GCGGCCAGCG
26941 ACACCATGAC GGCCCAGCAG ACGGGGTGCA CGACGTCGAC GCGGCGGGTC ACCTCCGGGT 
27001 CGTCGAGCAT GGCGATGGGG TCCCAGCCCG TGTGCGGGAT CAGCGCGTCG GCGCATTGGC
27061 GCATCCTGGC GGCGAACACC GGGGAGGCCG CCATCAGTTC GACGCCCATG CCGCGCCACT 
27121 GCGGTCCTTG TCCGGGGAAG ACGAAGACGG TGCGCGGCTC GGTGAGCGCC GTGCCGGTGA
27181 CGACGTCGTC GTCGAGCAGC ACGGCGCGGT GCGGGAACGT CGTACGCCTG GCGAGCAGGC
27241 CCGCGGCGAT GGCGCGCGGG TCGTGGCCGG GACGGGCGGC GAGGTGCTCG CGGAGTCGGC 
27301 GGACCTGGCC GTCGAGGGCC GTGGCGGTCC GCGCCGAGAC GGGCAGTGGT GTGAGCGGCG 
27361 TGGCGATCAG CGGCTCACCG GGCTTCGAGG CCGACGGCTC CTCGGCCGGC GGCTCCCCGG 
27421 CCGGGTGGGC TTCCAGCAGG ACGTGGGCGT TGGTGCCGCT GACGCCGAAG GAGGACACAC
27481 CGGCGCGCCG CGGGCGGTCG GTCTCGGGCC AGGGCCGGGC ATCGGTGAGG AGTTCGACGG 
27541 CGCCGGCCGT CCAGTCGACG TGCGAGGACG GCGTGTCCAC GTGCAGGGTG CGCGGCAGGG 
27601 TGCCGTGCCC CATGGCGAGG ACCATCTTGA TGACACCGGC GACACCCGCG GCGGCCTGAG
27661 TGTGGCCGAT GTTGGACTTC AGCGAGCCCA GCAGCACCGG GGTGTCGCGC CCCTGCCCGT
27721 AGGTGGCCAG CACCGCCTGT GCCTCGATGG GATCGCCCAG CCTGGTGCCG GTGCCGTGCG 
27781 CCTCCACGGC GTCCACGTCC GCCGGGGTGA GCCCGGCGTT GGCCAGGGCC TGCCGGATCA 
27841 CCCGCTCCTG CGAGGGCCCG TTCGGCGCCG ACAACCCGTT GGAAGCACCG TCCTGGTTGA
27901 CCGCCGAACC CCGGACAACC GCCAGCACAC GGTGGCCGTT GCGCTCGGCA TCGGAGAGCC 

27961 TCTCGACGAT CAGCACACCG GACCCCTCGG CGAAACCGGT GCCGTCAGCC GCATCCGCGA
28021 ACGCCTTGCA GCGCGCGTCG GGCGCGAGAC CCCGCTGCTG GGAGAACTCG ACGAAGCCGG 
28081 ACGGCGAGGC CATCACCGTG ACGCCGCCGA CCAGGGCGAG CGAGCATTCG CCGGAGCGCA 
28141 GTGACTGCCC GGCCTGGTGC AGCGCCACCA GCGACGACGA ACACGCCGTG TCGACCGTGA 
28201 CCGCCGGACC CTCCAGACCG TAGAAGTACG ACAGCCGACC GGACAGCACA CTGGTCTGGG 
28261 TGCCGGTCGC GCCGAAACCG CCCAGGTCGG TGCCGAGTCC GTACCCGTCG GAGAAGGCGC 
28321 CCATGAACAC GCCGGTGTCG CTTCCGCGCA GCGACTCCGG GAGGATCCCG GCGTGTTCCA 
28381 GCGCCTCCCA CGAGGTCTCC AGGACCAGAC GCTGCTGCGG GTCCATCGCC AGCGCCTCAC 
28441 GCGGACTGAT CCCGAAGAAC GCCGCGTCGA AGTCCGCCAC CCCGGCGAGG AAGCCACCAT
28501 GACGCACGGT CGACGTGCCC GGATGATCCG GATCGGGATC GTACAGCCCG TCCACGTCCC 
28561 AACCACGGTC CGTCGGAAAC GCCGTGATCC CGTCACCACC CGACTCCAGC AGCCGCCACA 

28621 AGTCCTCCGG CGACGCGACC CCACCCGGCA GCCGGCAGGC CATCCCCACG ATCGCCAACG
28681 GCTCGTCCTG CCGGACGGCC GCGGTCGTGG TGCGGGTCGG CGATGCCGTC CGGCCGGACA
28741 GCGCCGCGGT GAGCTTCGCC GCGACGGCGC GCGGCGTCGG GAAGTCGAAG ACCGCGGTGG 
28801 CGGGCAGCCG TACGCCCGTC GCCTCGGTGA AGGCGTTGCG CAGCCGGATC GCCATGAGCG 
28861 AGTCGACGCC GAGTTCCTTG AACGTGGCGG TCGCCTCGAC CCGTGCGGCA CCGTCGTGGC 
28921 CGAGTACGGC CGCGGTGCAC TGCCGGACGA CGGCGAGCAC GTCCTTTTCG GCGTCCGCGG
28981 CGGAGAGCCG CGCGATCCGG TCGGCGAGGG TGGTGGCGCC GGCCGCCCGG CGCCGCGGCT 
29041 CCCGGCGCGG TGCGCGCAGC AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGACCA 
29101 GCGCCGGGTC CGAGGACCGC AACGCCGCGT CGAACAGCGT CAGTCCGCCT TCGGCGGTCA 
29161 GCGCCGTCAC GCCGTCGCGG CGCATGCGGG CGCCGGTGCC GACCGTCAGC CCGCTCTCCG 
29221 GTTCCCACAG GCCCCAGGCC ACGGACAACG CGGGCAGTCC GGCTGCCCGG CGCTGTTCGG 
29281 CCAGCGCGTC GAGGAACGCG TTCGCGGCCG CGTAGTTGCC CTGTCCGGGG CTGCCGAGCA 
29341 CACCGGCGGC CGACGAG7AG AGGACGAACG CGGCCAGTTC CGTGTCCTGG GTGAGTTCGT 
29401 GCAGGTGCCA CGCGGCGTCC ACCTTCGGGC GCAGCACCGT CTCGAGCCGG TCGGGGGTGA 
29461 GCGCGGTGAG GACGCCGTCG TCGAGGACGG CCGCGGTGTG CACGACGGCC GTGAGCGGGT
29521 GCGCCGGGTC GATCCCCGCC AGTACGGAGG CGAGTTCGTC CCGGTCGGCG ACGTCGCAGG 
29581 CGATCGCCGT GACCTCGGCG CCGGGCACGT CGCTCGCCGT GCCGCTGCGC GACAGCATCA
29641 GCAGCCGGCG CACGCCGTGG CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGCGTCC 
29701 CGGAGCCACC GGTGACGAGC ACGGTGCCGT CCGGGTCGAG CGCCGGAGCG TCACCCGCCG 
29761 GGACCGCCGG GGCCAGACGG CGGGCGTACA CCTGGCCGTC ACGCAGCACC ACCTGGGGCT
29821 CATCGAGCGC GGTGGCCGCT GCGAGCAGCG GCTCGGCGGT GTCCGGGGCG GCGTCGACGA 
29881 GGACGATCCG GCCGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GCCGCGGCCG 
29941 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA 
30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 
30061 CACCGCCGCC GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 
30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA 
30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA 
30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG 
30301 CCGTGGCGCC GGTGGCG7GG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT 
30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTG CAGGCCGTAC CGTCCGGCGT 
30421 CGGCGAGCTG TCCGTCGGCG AGGGCCACTT CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA
30481 CGGCGCGCGG GCGGGGCAGC GCGGGCCCGT CCGTGTACCC GGCTCGGGCC AGACGGTCGG 
30541 CGATGTCGTC GGGGTCCACC GGCCGGGCCG TGGCGGGCGG CCACGTCGAC GGCATCTCCC 
30601 GCACGGCCGG GGCCGTCCGC GGGTCGGGGG CGAGGATTCC GTGCGCGTGC TCGGTCCACT 
30661 CCCCCGCCGC GTGCCGCGTG TGCACGGTGA CCGCGCGGCG GCCGTCCGCC CCGGGCGCGC 
30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG ACCGCGGCAG CGTGAGGGGG GTGTCCACGG 
30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT CGTCGCCCGC CCGGATCGCC AGATCCAGGA
30841 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT 
30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG TGCCGGGCAG GGTGACCGCC GCGGTCAGCG 
30961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG GGGCCGCGTC GCCCGCGGTC TGGGTGCCGA 
31021 GCCAGTAGCG GACCCGCTCG AACGGGTACG TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 
31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA CGCCGTCGGT GTGCAGCCGG GCGAGCGCGG 
31141 TCAGGGCGGA TCGCGGTTCG TCGTCGGCGT GCAGCATCGG GATGCCGTCG ACGAGTCGGG 
31201 TCAGGCTCCG GTCCGGGCCG ATCTCCAGGA GCACCGCCCC GTCGTGCGCG GCGACCTGTT 
31261 CCCCGAACCG GACGGTGTCG CGGACCTGTC GTACCCAGTA CTCCGGCGTG GTGCAGGCGG 
31321 CGCCCGCGGC CATCGGGATC CTCGGCTCGT GGTACGTCAG GCTCTCCGCG ACCTTGCGGA 
31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG AGTGGAACGC GTGGCTGGTC CGCAGGCGGG 
31441 TGAAGCGGCC GAGCCGGGCC GCGACGTCGA GCACCGCCTC CTCGTCACCG GAGAGCACGA
31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT CCACGCCGTC CCGCAGCAGC GGCAGCGCGT 
31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG CCCCGCCGGA CGGCAGCGCC TGCATCAGGC 
31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT CCTCCAGGGA CCAGACGCCG GCGACGTACG 
31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA CGAAGGCGTC CGGGCGTACG CCCCACGCCT 
31741 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG CGAACACCGC GGGCTGGGCG TACCCGGTGT 
31801 CGTGGAGGTC GAGCCCGGCG GGCACGTCGA GGGCGTCCAG CACCTCGCGG CGAGTGCGGG 
31861 CGAAGACGTC GTAGGCGGCG GCCAGTCCGT CGCCCATGCC GGGACGTTGT GAGCCCTGTC 
31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG GTTCTGCGGC GCCGGTGACC GTGTCGGTGC 
31981 CGATCAGCGC GGCCCGGTGC GGGAAGGCCG TGCGGGCGAG CAGGGCCGCG GCCACCGCGC 
32041 GCTCGTCCTC CTCGCCGGTG GCGAGGTGGG CGCGCAGGCG GTGTACCTGT GCGTCGAGTG 
32101 CCTGCGGGGT GCGTGCCGAG AGCAGCAGGG GCAGCGGTCC GGTGTCGGGT GCCGGGGCGG 
32161 GTTCGGGGGC CGGTCGGGGG TGGCTTTCGA GGATGATGTG AGCGTTGGTG CCGCTAACGC 
32221 CGAAGGAGGA CACCCCGGCG CGCCGTGGGC GGTCGGTTTC GGGCCAGGGG CGGGCGTCGG 
32281 TGAGGAGTTC GACGGCGCCG GCCGTCCAGT CGACGTGCGA GGACGGCGTG TCCACGTGCA 
32341 GGGTGCGCGG CAGGGTGCCG TGCCGCATGG CGAGGACCAT CTTGATGACA CCGGCGACGC 
32401 CCGCGGCGGC CTGAGTGTGG CCGATGTTGG ACTTCAGCGA GCCCAGCAGC ACCGGGGTGT
32461 CGCGATGCTG CCCGTAGGTG GCCAGTACCG CCTGCGCCTC GATGGGGTCG CCCAGCCTGG
32521 TCCCGGTGCC ATGCGCCTCG ACAGCGTCCA CATCCGCCGG GGTGAGCCCG GCGTTGGCCA 
32581 GCGCCTGCCG GATCACCCGC TCCTGCGACG GCCCGTTCGG CGCCGACAAC CCGTTGGAAG 
32641 CACCGTCCTG GTTGACCGCC GAACCACGCA CGACCGCCAG GACATTGTGG CCGTGCCGCT 
32701 CGGCGTCGGA GAGCCTCTCG ACGATCAGCA CACCGGATCC CTCGGCGAAA CCGGTGCCAT 
32761 CAGCCGCATC CGCGAACGCC TTGCAGCGGC CGTCCGGGGA GAGGCCCCGC TGCTGGGAGA 
32821 AGTCCACGAA GCCGGACGGC GAGGCCATCA CCGTGACGCC GCCGACCACG GCGAGCGAGC 
32881 ACTCCCCCGA GCGCAGCGAC TGCCCGGCCT GGTGCAGCGC CACCAGCGAC GACGAACACG 
32941 CCGTGTCCAC CGTGACCGCC GGACCCTCCA AACCGTAGAA GTACGACAGC CGACCGGACA 
33001 GCACACTGGT CTGGGTGCTG GTGGCACCGA AACCGCCGCG GTCGGCTCCA GTGCCGTACC 
33061 CGTAGAAGTA GCCGCCCATG AACACGCCGG TGTCGCTTCC GCGCAGCGAC TCCGGGAGGA 
33121 TCCCGGCGTG TTCCAGCGCC TCCCACGAGG TCTCCAGGAC CAGACGCTGC TGCGGGTCCA 
33181 TCGCCAGCGC CTCACGCGGA CTGATCCCGA AGAACGCCGC GTCGAAGTCC GCCACCCCGG 
33241 CGAGGAAGCC ACCATGACGC ACGGTCGACG TGCCCGGATG ATCCGGATCG GGATCGTACA 
33301 GCCCGTCCAC GTCCCAACCA CGGTCCGTCG GAAACGCCGT GATCCCGTCA CCACCCGACT 
33361 CCAGCAGCCG CCACAAGTCC TCCGGCGACG CGACCCCACC CGGCAGCCGG CAGGCCATCC 
33421 CCACGATCGC CAACGGCTCG TCCTGCCGGA CGGCCGCGGT CGGGGTACGC CGCCGGGTGG 
33481 TGGCCCGCGC GCCGGCCAGT TCGTCCAGGT GGGCGGCGAG CGCCTGCGCC GTGGGGTGGT
33541 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT 
33601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT CCCTGAACGC GCGCGCGGGT GCGATGGCGT 






33661 GGGCGTCGCG GTGGCCGAGC ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC 
33721 GCGCGGCCGG AGGTGCGGAC GTGCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG 
33781 GGACCCGGTC GGACGCGGCG ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC 
33841 GGTCGGTGTG CAGGGCCGCG TCGAACAGGG CGAGCCCCTG TGCGGCCGTC ATCGGGGTCA 
33901 TGCCGTTGCG GGCGATGCGG GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG
33961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGGCAG CCCCTGGTGG TGCCGGTGGC 
34021 GGGCGAGCGC GTCGAGGAAC GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA
34081 ACGTGGCGGA TATGGACGAG TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT 
34141 CGTGCAGGTG CCAGGCGACG TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC 
34201 GCATGGTCGT CACGGCCGCG TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC
34261 GCTGGGCGAC GTCGGCGACG ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGCCA
34321 CGTACCGCAC GCGGTCGTCC TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA 
34381 CGACCTCGGC GGCCTCGTGC ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC 
34441 CGGTGCCGCC GGTGACGAGG ACGGTCCCGC CGGTCAGCGG GGAGGTTCCG GTGGCCGCGG
34501 CGACACGGCG CAGACGGGCC GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT
34561 CGCCGGCGGC GAGCCCGGCC GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGGG
34621 CGACGCGGCC GGGATGCTCC GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCCTGCG
34681 CGGGATCGCC GGTACGGGTG GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG CACGGTGGTG AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG GTCGCCGGGT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTACGTC CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCGGCA TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCCGTGCTCG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
35041 CCAGCAGCAC GCGCAGCGCG GTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA
35101 ACGCCAGCCG GCGCCGCTCC GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT
35161 CGAGCAGCAC GGGGTGCAGC CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG 
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT 
35281 ACGAGAGCGG CAGCGCGTCG TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG 
35341 GGCAGTCCAC GGGCTCCGCC GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA 
35401 GCGCCCAGGG GCCCGTGCCG GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAC GGTGGCCTGG ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG CTCCGCGATC TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT 
35581 CCACGAGCGC CGAGCCGGGC ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG 
35641 GCTGACGGCG TACCGAGACA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA 
35701 CCCACGAGCC GAGCAGCGGG TGGCCGGACG TTCCCGCCGG TTCCGCGTCG ATCCAGTAGC
35761 GGTCACGGCG GAACGGGTAC GTGGGCAGCG GCACCACCCG ACGCGTCGCG AACGACCAGG
35821 TGACGGGCAC GCCCCGGACC CAGAGCGCGG CGAGCGACCG AGTGAAGCGG TCCAGGCCGC 
35881 CCTCGCCTCG CCGCAGTGTG CCGGTGACGA CCGTATGCGC ATGCCCGGCG AGCGTGTCCT 
35941 CCAGTGCGGT GGTGAGCACG GGATGCGCGC TGACCTCGAC GAACGCGCGG TATCCGCGGT 
36001 CCGCCAGGTG GCCGGTCGCG GCGGCGAACC GAACGGTGCG GCGCAGGTTG TCGTACCAGT
36061 AGGCGGCGTC CGCGGGCCGG TCCAGCCACG CCTCGTCCAC GGTGGAGAAG AACGGGACGT 
36121 CCGGCGTGCG CGGAGTGATG CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA 
36181 CATGCGCGGT GTGCGACGCG TAGTCGACGG CGATCCGGCG GGCGCGGGGG GTGGCGGCCA 
36241 GCAGCTCCTC CACGGCGTCG GCCGCACCGG CGACAACGAT CGACGCGGGT CCGTTGACCG 
36301 CGGCGACCTC CAGGCGCCCG GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCACCGAGA
36361 CCATGCCGCC CTGCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACCT
36421 TCGCGGCGTC GTCCAGGGTG AGCACCCCGG CGACGCAGGC CGCGGCGACT TCGCCCTGGG 
36481 AGTGGCCGAC GACCGCGGCC GGGGCGACCC CGTGCGCACG CCACAGCTCC GCCAGCGCCA 
36541 CCATCACCGC GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGCTCCGG 
36601 GCCGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGAGC GCCGTGGCGC 
36661 ACTCGCGGAG CCGCCGGGCG AACACGGGCT CGGTGGCGAG CAGTTCGGCA CCCATGCCGG 
36721 CCCACTGGGA GCCCTGCCCG GGGAACGCGA ACACGACACG TGTGTCGGTG ACGTCGGCGG 
36781 TTCCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TCTCGGGCCG 
36841 GCACGACCGC CCGGTGGCGC ATGGCCGTCC GGGTGGTGGC GAGCGAGTGG CCGACCGCGG 
36901 CCGCGGCGCC AGTGAGCGGG GCCAGCTGTC CCGCGACGTC CCGCAGTCCC TCCGGGGTCC
36961 GGGCCGACAT CGGCCAGACC ACGTCCTCGG GCACCGGCTC GGCTTCGGGT GCGGACACGG 
37021 GTGCGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC 
37081 CGAACGACGA GACACCCGCA CGCCGGGCGC GCCCGGTGAC CGGCChCGGC TCACTGCGGT 
37141 GCAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGC7CGTCG ACGTGCAGCG 
37201 TGCGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG 
37261 CCGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC
37321 GTTCGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC 
37381 CGGTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG 
37441 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC
37501 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCCAGCAC GGGGTGGCCG TGGCGGGTGG
37561 CGTCGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG 
37621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT
37681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT
37741 CCCCCGAGCG CAGCGACCGC GCGGCCTGGT GCAGCGCCAC CAGCGACGAC GAACACGCCG
37801 TGTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA 
37861 CGCTGGTCGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCCT
37921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC
37981 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG
38041 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA 
38101 GGAAGCCGCC GTGACGCACG GAAACCTTGC CGACCGCGTC GGGGTTCGGG TCGTAGAGCG 
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGAT CGCGTCCCCG CCGGAGTCGA 
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA 
38281 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG
38341 CAGGGGCCGG CTCACCCCGC CGTTCCTCAT CCAGGCGGGC GGCGAGCGCG GCCGGTGTCG 
38401 GGTGGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCGT CGTCTCGGCG AGGCTGTTGC
38461 GCAACCGGAC ACCGCTGAGC GAGTCGATGC CGAGGTCCTT GAACGCCGTC GTGGGCGTGA
38521 TCTCGGAGGC GTCGGCGTGG CCGAGCACGG CGGCCGTGGC CGCACACACG ATGGCCAGCA 
38581 GGTCACGATC GCGGTCGCGG TCGCGGTCGC GGTTGTCCTC CGCACGGGCG GCGATGCGGC 

38641 GCTCGGTCCG CTGCCGGACG GGCTCGGTGG GAATCGCCGC GACCATGAAC GGCACGTCCG
38701 CGGCGAGGCT CGCGTCGATG AAGTGGGTGC CCTCGGCCTC GGTGAGCGGC CGGAACCCGT
38761 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGGTGTGCC
38821 ACATGCCCCA GGCGATGGAG GTGGCGGGTT GGCCGAGGGT GTGGCGGTGG GTGGCGAGGG 
38881 CGTCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG
38941 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGTT TTGGGTGAGG TGGTGCAGGT
39001 GCCAGGCGGC GTTGGCTTTG GGGTGGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCGT 
39061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GGTTGGGGGA
39121 TGTGGGCGAG GGTGGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC 
39181 CGGGGGTGGT GTCGGGGGGT GGGGTGCGGG AGAGGAGGTA GGTGTGGGGG TGGTTCAGGT 
39241 GGCGGGCGAG GATGCCGGCG AGGGTGCCGG AGCCGCCGGT GATGATGATG GCGTGTTCGG 
39301 GGTTGAGGGG GGTGGTGGTG GGTGGGGTGG TGGTGTGGAG GGGGGTGAGG TGGGGTCGGT 
39361 GGAGGGTGTG GTGGGTGAGG CGGAGGTGGG GGTGGTCGAG GGTGGCGAGT TGGGCCAGGG 
39421 GGAGGGGAGT GTGGGGGTGG TCGGTTTCGA TGAGGCGGAT GCGGTGGGGG TGTTCGTTCT
39481 GGGCGGTGCG GGTGAGGCCG GTGACGGTGG CGCCGGCGGG GTCGGTGGTG GTGTGGACGA 
39541 TGAGGGTGTG GTCGGTGGTG GTGAGGTGGT GTTGCAGGGC GGTCAGGACG CGGGTGGCGC 
39601 GGGTGTGGGC GCGGGTGGGT ATGTCCTCGG GGTCGTCGGG GTGGGCGGCG GTGATCAGGA 
39661 CGTGTCCCTC GGGCAGGTCA CCGTCGTAGA CCGCCTCGGC GACCGCGAGC CACTCCAACC 
39721 GGAGCGGGTT CGGCCCCGAC GGGGTGTCGG CCCGCTCCCT CAGCACCAGC GAGTCCACCG 
39781 ACACGACAGG ACGGCCATCC GGGTCGGCCA CGCGCACGGC GACGCCGGCC TCCCCCCGGG 
39841 TGAGGGCGAC GCGCACCGCG GCGGCCCCGG TGGCGTTCAG GCGCACGCCC GTCCAGGAGA 
39901 ACGGCAGCTC GATCCCGCCG CCCGCGTCGA GGCGCCCGGC GTGCAGGGCC GCGTCGAGCA 
39961 GTGCCGGATG CACACCGAAA CCGTCCGCCT CGGCGGCCTG CTCGTCGGGC AGCGCCACCT 
40021 CGGCATACAC GGTGTCACCA TCACGCCAGG CAGCCCGCAA CCCCTGGAAC GCCGACCCGT
40081 ACTCATAACC GGCATCCCGC AGTTCGTCAT AGAACCCCGA GACGTCGACG GCCGCGGCCG
40141 TGGCCGGCGG CCACTGCGAG AACGGCTCAC CGGAAGCGTT GGAGGTATCC GGGGTGTCGG
40201 GGGTCAGGGT GCCGCTGGCG TGCCGGGTCC AGCTGCCCGT GCCCTCGGTA CGCGCGTGGA 
40261 CGGTCACCGG CCGCCGTCCG GCCTCATCGG CCCCTTCCAC GGTCACCGAC ACATCCACCG 
40321 CTGCGGTCAC CGGCACCACG AGCGGGGATT CGATGACCAG TTCATCCACC ACCCCGCAAC
40381 CGGTCTCGTC ACCGGCCCGG ATGACCAGCT CCACAAACGC CGTACCCGGC AGCAGAACCG
40441 TGCCCCGCAC CGCGTGATCA GCCAGCCAGG GATGCGTACG CAATGAGATC CGGCCGGTGA 
40501 GAACAACACC ACCACCGTCG TCGGCGGGCA GTGCTGTGAC GGCGGCCAGC ATCGGATGCG 
40561 CCGCCCCGGT CAGCCCGGCC GCGGACAGGT CGGTGGCACC GGCCGCCTCC AGCCAGTACC
40621 GCCTGTGCTC GAACGCGTAG GTGGGCAGAT CCAGCAGCCG CCCCGGCACC GGTTCGACCA
40681 CCGTGCCCCA GTCCACCCCC GCACCCAGAG TCCACGCCTG CGCCAACGCC CCCAGCCACC
40741 GCTCCCAGCC ACCGTCACCA GTCCGCAACG ACGCCACCGT GCGGGCCTGT TCCATCGCCG 
40801 GCAGCAGCAC CGGATGGGCA CTGCACTCCA CGAACACCGA CCCGTCCAGC TCCGCCACCG
40861 CCGCATCCAG CGCGACAGGG CGACGCAGGT TCCGGTACCA GTACCCCTCA TCCACCGGCT 
40921 CGGTCACCCA GGCGCTGTCC ACGGTCGACC ACCACGCCAC CGACCCGGTC CCGCCGGAAA
40981 TTCCCTTCAG TACCTCAGCG AGTTCGTCCT CGATGGCCTC CACGTGAGGC GTGTGGGAGG
41041 CGTAGTCGAC CGCGATACGA CGCACCCGCA CCCCATCAGC CTCATACCGC GCCACCACCT 
41101 CCTCCACCGC CGACGGGTCC CCCGCCACCA CCGTCGAAGC CGGACCATTA CGCGCCGCGA 
41161 TCCACACACC CTCGACCAGA CCCACCTCAC CGGCCGGCAA CGCCACCGAA GCCATCGCCC 
41221 CCCGGCCGGC CAGCCGCGCC GCGATCACCC GACTGCGCAA CGCCACCACG CGGGCGGCGT 
41281 CCTCCAGGCT GAGGGCTCCG GCCACACACG CCGCCGCGAT CTCCCCCTGC GAGTGTCCGA 
41341 CCACAGCGTC CGGCACGACC CCATGCGCCT GCCACAGCGC GGCCAGGCTC ACCGCGACCG 
41401 CCCAGCTGGC CGGCTGGACC ACCTCCACCC GCTCCGCCAC ATCCGACCGC GACAACATCT
41461 CCCGCACATC CCAGCCCGTG TGCGGCAACA ACGCCCGCGC ACACTCCTCC ATACGAGCCG
41521 CGAACACCGC GGAACGGTCC ATGAGTTCCA CGCCCATGCC CACCCACTGG GCACCCTGCC
41581 CGGGGAAGAC GAACACCGTA CGCGGCTGAT CCACCGCCAC ACCCATCACC CGGGCATCAC 
41641 CCAGCAGCAC CGCACGGTGA CCGAAGACAG CACGCTCACG CACCAACCCC TGCGCGACCG 
41701 CGGCCACATC CACCCCACCC CCGCGCAGAT ACCCCTCCAG CCGCTCCACC TGCCCCCGCA
41761 GACTCACCTC ACCACGAGCC GACACCGGCA ACGGCACCAA CCCATCACCA CCCGACTCCA 
41821 CACGCGACGG CCCAGGAACA CCCTCCAGGA TCACGTGCGC GTTCGTACCG CTCACCCCGA 
41881 ACGACGACAC ACCCGCATGC GGTGCCCGAT CCGACTCGGG CCACGGCCTC GCCTCGGTGA 
41941 GCAGCTCCAC CGCACCGGCC GACCAGTCCA CATGCGACGA CGGCTCGTCC ACGTGCAGCG 
42001 TCTTCGGCGC GATCCCATGC CGCATCGCCA TGACCATCTT GATGACACCG GCGACACCCG 
42061 CAGCCGCCTG CGCATGACCG ATGTTCGACT TGACCGAACC GAGGTAGAGC GGCGTGTCGC 
42121 GGTCCTGCCC GTAGGCCGCG AGGACGGCCT GCGCCTCGAT CGGGTCGCCC AGCCGCGTGC 
42181 CGGTGCCGTG CGCCTCCACC ACCTCCACAT CGGCGGCGCG CAGTCCGGCG TTGACCAACG
42241 CCTGCCGGAT CACGCGCTGC TGGGCGACGC CGTTGGGGGC GGACAGTCCG TTGGAGGCAC 
42301 CGTCCTGGTT CACCGCCGAG CCGCGGACGA CCGCGAGAAC GGTGTGCCCG TTGCGCTCGG 
42361 CGTCGGAGAG CCGCTCCAGC ACGAGAACGC CGACGCCCTC GGCGAAGCCG GTCCCGTCCG 
42421 CCGCGTCGGC GAACGCCTTG CACCGTCCGT CCGGGGAGAG TCCGCGCTGC CGGGAGAACT
42481 CCACGAGCTC TGCGGTGTTC GCCATGACGG TGACACCGCC GACCAGCGCC AGGGAGCACT
42541 CCCCGGCCCG CAGTGCCTGT GCCGCCTGGT GCAGGGCGAC CAGCGACGAC GAGCACGCCG 
42601 TGTCGACCGT GACCGCCGGG CCCTGAAGTC CGTACACGTA CGAGAGGCGC CCGGACAGGA
42661 CGCTCGTCTG CGTCGCCGTG ACACCGAGCC CGCCCAGGTC CCGGCCGACG CCGTAGCCCT 
42721 GGTTGAACGC GCCCATGAAC ACGCCGGTGT CGCTCTCCCG GAGCCTGTCC GGCACGATGC 
42781 CGGCGTTCTC GAACGCCTCC CAGGAGGTCT CCAGGATCAG GCGCTGCTGG GGGTCCATCG
42841 CCAGCGCCTC GTTCGGACTG ATGCCGAAGA ACGCGGCGTC GAACCCGGCG CCGGCCAGGA
42901 ATCCGCCGTG GCGTGTCGTG GAGCGGCCGG CCGCGTCCGG GTCCGGGTCG TACAGCGCGT 
42961 CGACGTCCCA GCCCCGGTCG GTGGGGAACT CGGTGATCGC CTCGGTACCG GCGGCGACGA 
43021 GCCGCCACAG GTCCTCCGGC GAGGCGACCC CGCCGGGCAG TCGGCACGCC ATGCCGACGA
43081 TCGCGACGGG GTCGCCGGAG CCGAGGGTCT GGGCGGTCGC GGGTGCCGCT GTCGCGGAGC
43141 CGGCGAGGTG GGCGGCGAAC GCACGCGGAG TGGGGTGGTC GAACGCGGTT GACGCGGGCA 
43201 CCCGCAGACC CGTCCGCGCG GCGACGGTGT TGGTGAACTC GACGGTGGTG AGCGAGTCGA
43261 GGCCGTTCTC GCGGAACGTG CGGTCCGGGG AGCAGTGTCC GGCGCCCGGC AGGCCCAGGA
43321 CGGTGGCGAC GCTGTCGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG
43381 CGGCGAGGCG GTTCGCCCAC TCCTGTTCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG
43441 CGGTGAGGAT CGGCGGCGTG GCGCCCGCCA TCGTCGCGGC CCGCGCCCCG GCGGAACCGG 
43501 TCCGGGCCAC GATGTACGAG CCGCCGCCCG CGATGGCCTT CTCGATCAGG TCGCCGGTGA 
43561 GCGCCGGCCG TTCGATGCCG GGCAGCGCGC GGACGGTGAC GGTGGGGAGT CCCTCCGCGG
43621 CCCGTGGCCG GGTGTGGGCG TCGGCGCCGG CCGGGCCGTC GAGCAGGACG TGCACGAGCG
43681 CGCCGGGGTT CGCGGCTTCC TCGGCTGCGG TGGTCACGTG GGTGAGGCCG GTCTCGTCGC 
43741 GGAGCAGGCC GGCGACGGTG TCGGCGTCCT CCCCGGTGAC CAGGACCGGC GCGTCCGGGC
43801 CGATCGGAGG CGGCACGGTG AGGACCATCT TGCCGGTGTG CCGGGCGTGG CTCATCCACG 
43861 CGAACGCGTC CCGCGCACGG CGGATGTCCC ACGGCTGCAC CGGCAGCGGG CACAGCTCAC
43921 CGCGGTCGAA CAGGTCGAGG AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCACGT 
43981 CGGCCAGGTC GAACGGCTGC TGGGCGGCGT GGCGGATGTC GGTCTTGCCC ATCTCGACGA 
44041 ACCGGCCGCC CGGTGCGAGC AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT
44101 TGAGCACGAC GTCGACCGGC GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA 
44161 CATGGTCGGT GTCGAAGCCG TCGGCGTGCA GCAGGTGTTG TTTGGCGGGA CTGGCGGTGG 
44221 CGTACACCTC GGCGCCGAGG TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACCGCCCG
44281 TCGCGGCGTG GACCAGGACC TTCTGGCCGG GTCGCAGCTC GCCCGCGTCG ACGAGGCCGT 
44341 ACCAGGCGGT GGCGAACACG ATGGGCACGG ACGCGGCGAT GGGGAACGAC CATCCCCGTG
44401 GGATCCGTGC GACCAGCCGC CGGTCCGCGA CCACGCTGCG CCGGAACGCG TCCTGCACGA
44461 GACCGAACAC GCGGTCGCCG GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA
44521 TGCCCGCGGC CTCCCCGCCC ATCTCGCCCT CGCCCGGGTA GGTGCCGAGC GCGATCAGCA
44581 CGTCGCGGAA GTTCAGCCCC GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG
44641 GCGCGGCGGG ACGTCGAGCG GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCGG
44701 GCGCAGCGCC CACTGGCGCG GTCGGCAGGG GGGTGGTGTC CGCGCGTACC AGCCGGGGCA
44761 CGTAGGCCAC GCCGGCCCGC AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA
44821 CGAGGTCGTC ATCGCCGTCC GTGTCCACCA GCACGAACGA TCCGGGTTCG GCGGCCTGGC
44881 GGCGCAGCGC CTCGTCCCAG AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA
44941 CGCCCACCGC GCGGCGGGTG ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC
45001 GCCGCTCCCA GACCAGTTCG CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGATGGG 
45061 CCGGCAGCCC CGCGAGCCGC GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGATCGTGG 
45121 TGACGTGCCA GATCTCGTCG GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA
45181 GGATCGCCTC GGCGGGGACG CGGGGGCCGT CGGAAACGAC GTAGAGCACG GGTATGTCGC 
45241 CGAGGACGGG GTGCGGGCGG CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACCTCCTGGG
45301 CGACGGTCTC GATCTCCCGG GGGTGGATGT TCTCCCCGCC GCGGATGATC AGCTCCTTGA 
45361 CCCGGCCGGT GATCGTCACG TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGGTGCGGT 
45421 ACCAGCCGTC CACGAGCACC TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCATGA 
45481 GGCTCGGCCC GCTCGCCCAC AGCTCGCCCT CCTCGCCGGG TGCCACGTCG GCGCCGGACA
45541 CCGGGTCGAC GAACCGCAGC GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC
45601 GCGCATCCTC CAGGGTGTTG GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCGTACGTGT 
45661 CGAGCAGGGG CACGCCGAAC GTCGCCTCGA AATCCCTGGT GAGCGACGCC GGCGAGGTGG 
45721 ATCCGGCGAC CAGCGCCACG CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA
45781 GGAGGTAGCG GTACATCGTC GGCACGCCGA CGAGCACGGT GCTGGAGTGT TCGGCCAGGG
45841 CGTCGAGGAC GTCACGCGCG ACGAAGCCGC CCAGGATACG GGCGGACGCG CCGACCGTGA
45901 GGACGGCGAG CAGGCAGAGG TGGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA
45961 GTTCGTCGTC CTCGGTCAGC CGCCAGGACG GCACGTCGCA GTGCATCGCG GACCACAGGC 
46021 CGCTGCGCTG TGCGGAAACC ACGCCCTTGG GACGGCCGGT GGTGCCGGAG GTGTAGAGCA
46081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCGT CGCGGGGCGG GCACGGCGGC TCGGTCCCGG
46141 CGAGGTCCTC GTAGGAGACG CAGTCCGGTG CCCGGCGCCC GACGAGCACG ACGGTGGCGT
46201 CGGTGCCGGT GCGGCGCACC TGGTCGAGGT GGGTTTCGTC GGTGACCAGC ACGGTCGCGC
46261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC GTCCGGGTTG AGCGGGACGG
46321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GGTCTCGATC CGGTTGCCGA
46381 GCAGCATCGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGGTGT CCGGCGAGCC
46441 GGCCGGCCCG GAGCCGGAGT TGCGTGTACG TCACGGCGCG TTGGGAATCC GTGTAGGCGA
46501 TCCGGTCGCC GCGTCGCTCG GCATGGATGC GGAGCAATTC GTGCAACGGC CGGATTGGTT
46561 CCACACGCGC CATGGAAACA CCTTTCTCTC GACCAACCGC ACAACAGCAC GGAACCGGCC
46621 ACGAGTAGAC GCCGGCGACG CTAGCAGCGT TTTCCGGACC GCCACCCCCT GAAGATCCCC
46681 CTACCGTGGC CGGCCTCCCC GGACGCTCAT CTAGGGGGTT GCACGCATAC CGCCGTGCGT
46741 AATTGCCTTC CTGATGACCG ATGCCGGACG CCAGGGAAGG GTGGAGGCGT TGTCCATATC
46801 TGTCACGGCG CCGTATTGCC GCTTCGAGAA GACCGGATCA CCGGACCTCG AGGGTGACGA
46861 GACGGTGCTC GGCCTGATCG AGCACGGCAC CGGCCACACC GACGTGTCGC TGGTGGACGG
46921 TGCTCCCCGG ACCGCCGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTCTG
46981 GCACGCACAG CGCCCTGTCG AGTCCGGCAT GGACAACGGC ATCGCCTGGG CCCGCACCGA
47041 CGCGTACCTG TTCGGTGTCG TGCGCACCGG CGAGAGCGGC AGGTACGCCG ATGCCACCGC
47101 GGCCCTCTAC ACGAACGTCT TCCAGCTCAC CCGGTCGCTG GGGTATCCCC TGCTCGCCCG
47161 GACCTGGAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTACCG
47221 GGACTTCTGC GTGGGCCGCG CCCAGGCGCT CGACGAGGGC GGGATCGACC CGGCCACCAT
47281 GCCCGCGGCC ACCGGTATCG GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCGC
47341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC 
47401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGCACGGGCC ACCTGGCTGG GCCCGCCGGA
47461 GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA
47521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC
47581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT
47641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGCCGCACG
47701 CCTGTCGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT
47761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGCTGCGC
47821 CTCGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT
47881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC
47941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT
48001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC
48061 GGCGAGCCCC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG
48121 GGCAGCGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC
48181 GCCACCGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCTC
48241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA
48301 CTCGCAGCCC ACTACACGGC GCTGCGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG
48361 CCGGTGCAGT ACGCCGACTT CGCCGCCTGG GAGCGGCGCG AACTCACCGG CGCCGGACTG
48421 GACAGGCGTC TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCTC
48481 CCCACCGACC GTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG
48541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG
48601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC
48661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC 
48721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA
48781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC
48841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG 
48901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCC
48961 GAACCGTTCC GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG
49021 GAGCCGGGTG GCGCGCTGAC CGGCGAACTG CTCTACAGCC GTGCGCTGTT CGACGAGCCA
49081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCCG
49141 GACGTACGGC TGTCGCGGCT GCCGGCCGGC GACGCGACGG CGGCAGCGCC CGTGGTGCCC
49201 TCGAACGACA CGGCGCGGGA CCTGCCCGTC GACACGCTGC CGGGCCTGCT GGCCCGGTAC
49261 GCCGCACGCA CCCCCGGCGC CGTGGCCGTC ACCGACCCGC ACATCTCCCT CACCTACGCG
49321 CAGCTGGACC GGCGGGCGAA CCGCCTCGCG CACCTGCTCC GCGCGCGCGG CACCGCCACC
49381 GGCGACCTGG TCGGGATCTG CGCCGATCGC GGCGCCGACC TGATCGTCGG CATCGTGGGG
49441 ATCCTCAAGG CGGGCGCCGC TTATGTGCCG CTGGACCCCG AACATCCTCC GGAGCGCACG
49501 GCGTTCGTGC TGGCCGACGC GCAGCTGACC ACGGTGGTGG CGCACGAGGT CTACCGTTCC
49561 CGGTTCCCCG ATGTGCCGCA CGTGGTGGCG TTGGACGACC CGGAGCTGGA CCGGCAGCCG
49621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC TCGCCTACGC GATCTACACG
49681 TCCGGGTCGA CCGGCAGGCC GAAGGCCGTG CTCATGCCGG GTGTCAGCGC CGTCAACCTG
49741 CTGCTCTGGC AGGAGCGCAC GATGGGCCGC GAGCCGGCCA GCCGCACCGT CCAGTTCGTG
49801 ACGCCCACGT TCGACTACTC GGTGCAGGAG ATCTTTTCCG CGCTGCTGGG CGGCACGCTC
49861 GTCATCCCGC CGGACGAGGT GCGGTTCGAC CCGCCGGGAC TCGCCCGGTG GATGGACGAA
49921 CAGGCGATTA CCCGGATCTA CGCGCCGACG GCCGTACTGC GCGCGCTGAT CGAGCACGTC
49981 GATCCGCACA GCGACCAGCT CGCCGCCCTG CGGCACCTGT GCCAGGGCGG CGAGGCGCTG
50041 ATCCTCGACG CGCGGTTGCG CGAGCTGTGC CGGCACCGGC CCCACCTGCG CGTGCACAAT 
50101 CACTACGGTC CGGCCGAAAG CCAGCTCATC ACCGGGTACA CGCTGCCCGC CGACCCCGAC 
50161 GCGTGGCCCG CCACCGCACC GATCGGCCCG CCGATCGACA ACACCCGCAT CCATCTGCTC 
50221 GACGAGGCGA TGCGGCCGGT TCCGGACGGT ATGCCGGGGC AGCTCTGCGT CGCCGGCGTC 
50281 GGCCTCGCCC GTGGGTACCT GGCCCGTCCC GAGCTGACCG CCGAGCGCTG GGTGCCGGGA 
50341 GATGCGGTCG GCGAGGAGCG CATGTACCTC ACCGGCGACC TGGCCCGCCG CGCGCCCGAC 
50401 GGCGACCTGG AATTCCTCGG CCGGATCGAC GACCAGGTCA AGATCCGCGG CATCCGCGTC
50461 GAACCGGGTG AGATCGAGAG CCTGCTCGCC GAGGACGCCC GCGTCACGCA GGCGGCGGTG
50521 TCCGTGCGCG AGGACCGGCG GGGCGAGAAG TTCCTGGCCG CGTACGTCGT ACCGGTGGCC 
50581 GGCCGGCACG GCGACGACTT CGCCGCGTCG CTGCGCGCGG GACTGGCCGC CCGGCTGCCC 
50641 GCCGCGCTCG TGCCCTCCGC CGTCGTCCTG GTGGAGCGAC TGCCGAGGAC CACGAGCGGC
50701 AAGGTGGACC GGCGCGCGCT GCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGGCGGTT 
50761 ACGCCCCGCA CCGATGCCGA GCGGACGGTG TGCCGGATCT TCCAGGAGGT GCTCGACGTC
50821 CCGCGGGTCG GTGCCGACGA CGACTTCTTC ACGCTCGGCG GGCACTCCCT GCTCGCCACC 
50881 CGGGTCGTCT CCCGCATCCG CGCCGAGCTG GGTGCCGATG TCCCGCTGCG TACGCTCTTC 
50941 GACGGGCGGA CGCCCGCCGC GCTCGCCCGT GCGGCGGACG AGGCCGGCCC GGCCGCCCTG 
51001 CCCCCGATCG CGCCCTCCGC GGAGAACGGG CCGGCCCCCC TCACCGCGGC ACAGGAACAG 
51061 ATGCTGCACT CGCACGGCTC GCTGCTCGCC GCGCCCTCCT ACACGGTCGC CCCGTACGGG 
51121 TTCCGGCTGC GCGGGCCACT CGACCGCGAA GCGCTCGACG CGGCACTGAC CCGGATCGCC 
51181 GCGCGCCACG AGCCGCTGCG GACCGGGTTC CGCGATCGGG AACAGGTCGT CCGGCCGCCC 
51241 GCTCCGGTGC GCGCCGAGGT GGTTCCGGTG CCGGTCGGCG ACGTCGACGC CGCGGTCCGG 
51301 GTCGCCCACC GGGAGCTGAC CCGGCCGTTC GACCTCGTGA ACGGGTCGTT GCIGCGTGCC
51361 GTGCTGCTGC CGCTGGGCGC CGAGGATCAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC 
51421 GGTGACGGAT GGTCCTTCGA CCTCCTGGTC CGGGAGTTGT CGGGGACGCA ACCGGACCTT 
51481 CCGGTGTCCT ACACGGACGT GGCCCGGTGG GAACGGAGTC CGGCCGTGAT CGCGGCCAGG
51541 GAGAACGACC GGGCCTACTG GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC 
51601 GCGGTCCGGC CCGGCGGGGC ACCGACCGGG CGGGCGTTCC TGTGGACGCT CAAGGACACC 
51661 GCCGTCCTGG CGGCACGCCG GGTCGCGGAC GCCCACGACG CGACGTTGCA CGAAACCGTG 
51721 CTCGGCGCCT TCGCCCTGGT CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG 
51781 ACGCCGTTCG CGGACCGGGG GTACGCCGGG ACCGACCACC TCATCGGCTT CTTCGCGAAG 
51841 GTCCTCGCGC TGCGCCTCGA CCTCGGCGGC ACGCCGTCGT TCCCCGAGG? GCTGCGCCGG 
51901 GTGCACACCG CGATGGTGGG CGCGCACGCC CACCAGGCGG TGCCCTACTC CGCGCTGCGC 
51961 GCCGAGGACC CCGCGCTGCC GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC 
52021 GCGGAACTGC GGCTGCCCGG CATGCACACC GAGCCGTTCC CCGTCGTCGC CGAGACCGTC 
52081 GACGAGATGA CCGGCGAACT GTCGATCAAC CTCTTCGACG ACGGTCGCAC CGTCTCCGGC 
52141 GCGGTGGTCC ACGATGCCGC GCTGCTCGAC CGTGCCACCG TCGACGATTT GCTCACCCGG 
52201 GTGGAGGCGA CGCTGCGTGC CGCCGCGGGC GACCTCACCG TACGCGTCAC CGGTTACGTG 
52261 GAAAGCGAGT AGCCATGCCC GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG 
52321 CGGAACTCCA GAAGACCCGT GCGGAACTCG CCGCGCACAG CGAGCCGTTG GCGATCGTGG 
52381 GGATGGCCTG CCGGCTGCCC GGCGGGGTCG CGTCGCCGGA GGACCTGTGG CAGTTGCTGG 
52441 AGTCCGGTGG CGACGGCATC ACCGCGTTCC CCACGGACCG GGGCTGGGAG ACCACCGCCG
52501 ACGGTCGCGG CGGCTTCCTC ACCGGGGCGG CCGGCTTCGA CGCGGCGTTC TTCGGCATCA 
52561 GCCCGCGCGA GGCGCTGGCG ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACCTCGTGGG 
52621 AGGCGTTCGA GCACGCGGGC ATCGATCCGC AGACGCTGCG GGGCAGTGAC ACGGGGGTGT
52681 TCCTCGGCGC GTTCTTCCAG GGGTACGGCA TCGGCGCCGA CTTCGACGGT TACGGCACCA
52741 CGAGCATTCA CACGAGCGTG CTCTCCGGCC GCCTCGCGTA CTTCTACGGT CTGGAGGGTC
52801 CGGCGGTCAC GGTCGACACG GCGTGTTCGT CGTCGCTGGT GGCGCTGCAC CAGGCCGGGC
52861 AGTCGCTGCG CTCCGGCGAA TGCTCGCTCG CCCTGGTCGG CGGCGTCACG GTGATGGCCT
52921 CGCCGGCGGG GTTCGCGGAC TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGCTGCA 
52981 AGGCCTTCGC GGAAGCGGCT GACGGCACCG GTTTCGCCGA GGGGTCCGGC GTCCTGATCG
53041 TCGAGAAGCT CTCCGACGCC GAGCGCAACG GCCACCGCGT GCTGGCGGTC GTCCGGGGTT 
53101 CCGCCGTCAA CCAGGACGGT GCCTCCAACG GGCTGTCCGC GCCGAACGGG CCGTCGCAGG
53161 AGCGGGTGAT CCGGCAGGCC CTGGCCAACG CCGGACTCAC CCCGGCGGAC GTGGACGCCG 
53221 TCGAGGCCCA CGGCACCGGC ACCAGGCTGG GCGACCCCAT CGAGGCACAG GCCGTGCTGG 
53281 CCACCTACGG GCAGGGGCGC GACACCCCTG TGCTGCTGGG CTCGCTGAAG TCCAACATCG 
53341 GCCACACCCA GGCCGCCGCG GGCGTCGCCG GTGTCATCAA GATGGTCCTC GCCATGCGGC 
53401 ACGGCACCCT GCCCCGCACC CTGCACGTGG ACACGCCGTC CTCGCACGTC GACTGGACGG
53461 CCGGCGCCGT CGAACTCCTC ACCGACGCCC GGCCCTGGCC CGAAACCGAC CGCCCACGGC
53521 GCGCCGGTGT CTCCTCCTTC GGCGTCAGCG GCACCAACGC CCACATCATC CTCGAAAGCC 
53581 ACCCCCGACC GGCCCCCGAA CCCGCCCCGG CACCCGACAC CGGACCGCTG CCGCTGCTGC 
53641 TCTCGGCCCG CACCCCGCAG GCACTCGACG CACAGGTACA CCGCCTGCGC GCGTTCCTCG 
53701 ACGACAACCC CGGCGCGGAC CGGGTCGCCG TCGCGCAGAC ACTCGCCCGG CGCACCCAGT 
53761 TCGAGCACCG CGCCGTGCTG CTCGGCGACA CGCTCATCAC CGTGAGCCCG AACGCCGGCC
53821 GCGGACCGGT GGTCTTCGTC TACTCGGGGC AAAGCACGCT GCACCCGCAC ACCGGGCGGC 
53881 AACTCGCGTC CACCTACCCC GTGTTCGCCG AAGCGTGGCG CGAGGCCCTC GACCACCTCG 
53941 ACCCCACCCA GGGCCCGGCC ACGCACTTCG CCCACCAGAC CGCGCTCACC GCGCTCCTGC 
54001 GGTCCTGGGG CATCACCCCG CACGCGGTCA TCGGCCACTC CCTCGGTGAG ATCACCGCCG
54061 CGCACGCCGC CGGTGTCCTG TCCCTGAGGG ACGCGGGCGC GCTCCTCACC ACCCGCACCC 
54121 GCCTGATGGA CCAACTGCCG TCGGGCGGCG CGATGGTCAC CGTCCTGACC AGCGAGGAAA 
54181 AGGCACGCCA GGTGCTGCGG CCGGGCGTGG AGATCGCCGC CGTCAACGGC CCCCACTCCC 
54241 TCGTGCTGTC CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATCCACC 
54301 ACCGCCTGCC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC GTCGCCCCCC
54361 TCCTCGACGT CGCCCGGACC CTGACGTACC ACCAGCCCCA CACCGCCATC CCCGGCGACC
54421 CCACCACCGC CGAATACTGG GCGCACCAGG TCCGCGACCA AGTACGTTTC CAGGCGCACA
54481 CCGAGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC CAACCAGGAC CTCTCGCCGC
54541 TCGTCGACGG CGTTGCCGCC CAGACCGGTA CGCCCGACGA GGTGCGGGCG CTGCACACCG
54601 CGCTCGCGCA GCTCCACG7C CGCGGCGTCG CGATCGACTG GACGCTCGTC CTCGGCGGGG
54661 ACCGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC
54721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAC CCGCTGCTCG
54781 GCGCCGCGGT CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC CTGTCGCTGG 
54841 CCTCCCATCC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG CCCGGCGCGG
54901 CCTTCCTCGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG CTGCACGAAC
54961 TCGTCATCGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TGTGGCGGTC TCCGTCGAGA
55021 TCGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT 
55081 CGGGCCTGTG GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGCCACGG 
55141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACGTCT 
55201 ACGACCGGTT CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGGCCG 
55261 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG
55321 ACGCCGCCCG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC 
55381 TGGCCGCGCT CGACGCACCC GGCGGGGCGG CCCGACTGCC GTTCTCGTTC CAGGACGTCC 
55441 GCATCCACGC GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA
55501 GCACCGTCCG CATGACCGGC CCGGACGGGC AGCTGGTGGC CGTGGTCGGT GCCGTGCTGT 
55561 CGCGCCCGTA CGCGGAAGGC TCCGGTGACG GCCTGCTGCG CCCGGTCTGG ACCGAGCTGC 

55621 CGATGCCCGT CCCGTCCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG
55681 ACGGCGACGT TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC 
55741 GCCACCTGTC CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG 
55801 CTGCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC 
55861 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG
55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGGTC CGGATGTCCG
55981 ACCCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT 
56041 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG
56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG 
56161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG
56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG 
56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC CCCGACGGCT
56341 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCCTGG
56401 TCGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGGTG 
56461 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCCA
56521 GTACCGGCAA GCAGCACGTC CTGCGCGCCG CCGGGCTGCC CGACACGCAC ATCGCCGACT 
56581 CTCGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGCTGA
56641 CCGGCGAGTT CATCGACGCG TCGCTCGACC TGCTGGACGC CGACGGCCGG TTCGTCGAGA 
56701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC 
56761 TGCTGGACGC GGGCGCCGAC CGCATCGGCG AGATCCTGGG CGAACTGCTC CGGCTGTTCG
56821 ACGCGGGCGC GCTGGAGCCG CTGCCGGTCC GTGCCTGGGA CGTCCGGCAG GCACGCGACG 

56881 CGCTCGGCTG GATGAGCCGC GCCCGCCACA TCGGCAAGAA CGTCCTGACG CTGCCCCGGC
56941 CGCTCGACCC GGAGGGCGCC GTCGTCCTCA CCGGCGGCTC CGGCACGCTC GCCGGCATCC 
57001 TCGCCCGCCA CCTGCGCGAA CGGCATGTCT ACCTGCTGTC CCGGACGGCA CCGCCCGAGG 
57061 GGACGCCCGG CGTCCACCTG CCCTGCGACG TCGGTGACCG GGACCAGCTG GCGGCGGCCC 
57121 TGGAGCGGGT GGACCGGCCG ATCACCGCCG TGGTGCACCT CGCCGGTGCG CTGGACGACG 

57181 GCACCGTCGC GTCGCTCACC CCCGAGCGTT TCGACACGGT GCTGCGCCCG AAGGCCGACG
57241 GCGCCTGGTA CCTGCACGAG CTGACGAAGG AGCAGGACCT CGCCGCGTTC GTGCTCTACT 
57301 CGTCGGCCGC CGGCGTGCTC GGCAACGCCG GCCAGGGCAA CTACGTCGCC GCGAACGCGT 
57361 TCCTCGACGC GCTCGCCGAG CTGCGCCACG GTTCCGGGCT GCCGGCCCTC TCCATCGCCT 
57421 GGGGGCTCTG GGAGGACGTG AGCGGGCTCA CCGCGGCGCT CGGCGAAGCC GACCGGGACC

57481 GGATGCGGCG CAGCGGTTTC CGGGCCATCA CCGCGCAACA GGGCATGCAC CTGTACGAGG
57541 CGGCCGGCCG CACCGGAAGT CCCGTGGTGG TCGCGGCGGC GCTCGACGAC GCGCCGGACG
57601 TGCCGCTGCT GCGCGGCCTG CGGCGGACGA CCGTCCGGCG GGCCGCCGTC CGGGAGTGTT
57661 CGTCCGCCGA CCGGCTCGCC GCGCTGACCG GCGACGAGCT CGCCGAAGCG CTGCTGACGC
57721 TCGTCCGGGA GAGCACCGCC GCCGTGCTCG GCCACGTGGG TGGCGAGGAC ATCCCCGCGA 

57781 CGGCGGCGTT CAAGGACCTC GGCATCGACT CGCTCACCGC GGTCCAGCTG CGCAACGCCC
57841 TCACCGAGGC GACCGGTGTG CGGCTGAACG CCACGGCGGT CTTCGACTTC CCGACCCCGC
57901 ACGTGCTCGC CGGGAAGCTC GGCGACGAAC TGACCGGCAC CCGCGCGCCC GTCGTGCCCC

57961 GGACCGCGGC CACGGCCGGT GCGCACGACG AGCCGCTGGC GATCGTGGGA ATGGCCTGCC
58021 GGCTGCCCGG CGGGGTCGCG TCACCCGAGG AGCTGTGGCA CCTCGTGGCA TCCGGCACCG

58081 ACGCCATCAC GGAGTTCCCG ACGGACCGCG GCTGGGACGT CGACGCGATC TACGACCCGG
58141 ACCCCGACGC GATCGGCAAG ACCTTCGTCC GGCACGGTGG CTTCCTCACC GGCGCGACAG 
58201 GCTTCGACGC GGCGTTCTTC GGCATCAGCC CGCGCGAGGC CCTCGCGATG GACCCGCAGC 
58261 AGCGGGTGCT CCTGGAGACG TCGTGGGAGG CGTTCGAAAG CGCCGGCATC ACCCCGGACT 
58321 CGACCCGCGG CAGCGACACC GGCGTGTTCG TCGGCGCCTT CTCCTACGGT TACGGCACCG 

58381 GTGCGGACAC CGACGGCTTC GGCGCGACCG GCTCGCAGAC CAGTGTGCTC TCCGGCCGGC
58441 TGTCGTACTT CTACGGTCTG GAGGGTCCGG CGGTCACGGT CGACACGGCG TGTTCGTCGT
58501 CGCTGGTGGC GCTGCACCAG GCCGGGCAGT CGCTGCGCTC CGGCGAATGC TCGCTCGCCC 
58561 TGGTCGGCGG CGTCACGGTG ATGGCGTCTC CCGGCGGCTT CGTGGAGTTC TCCCGGCAGC 

58621 GCGGCCTCGC GCCGGACGGC CGGGCGAAGG CGTTCGGCGC GGGTGCGGAC GGCACGAGCT
58681 TCGCCGAGGG TGCCGGTGTG CTGATCGTCG AGAGGCTCTC CGACGCCGAA CGCAACGGTC

58741 ACACCGTCCT GGCGGTCGTC CGTGGTTCGG CGGTCAACCA GGATGGTGCC TCCAACGGGC 
58801 TGTCGGCGCC GAACGGGCCG TCGCAGGAGC GGGTGATCCG GCAGGCCCTG GCCAACGCCG 
58861 GGCTCACCCC GGCGGACGTG GACGCCGTCG AGGCCCACGG CACCGGCACC AGGCTGGGCG
58921 ACCCCATCGA GGCACAGGCG GTACTGGCCA CCTACGGACA GGAGCGCGCC ACCCCCCTGC

58981 TGCTGGGCTC GCTGAAGTCC AACATCGGCC ACGCCCAGGC CGCGTCCGGC GTCGCCGGCA
59041 TCATCAAGAT GGTGCAGGCC CTCCGGCACG GGGAGCTGCC GCCGACGCTG CACGCCGACG 
59101 AGCCGTCGCC GCACGTCGAC TGGACGGCCG GCGCCGTCGA ACTGCTGACG TCGGCCCGGC 
59161 CGTGGCCCGA GACCGACCGG CCACGGCGTG CCGCCGTCTC CTCGTTCGGG GTGAGCGGCA 
59221 CCAACGCCCA CGTCATCCTG GAGGCCGGAC CGGTAACGGA GACGCCCGCG GCATCGCCTT 

59281 CCGGTGACCT TCCCCTGCTG GTGTCGGCAC GCTCACCGGA AGCGCTCGAC GAGCAGATCC
59341 GCCGACTGCG CGCCTACCTG GACACCACCC CGGACGTCGA CCGGGTGGCC GTGGCACAGA 
59401 CGCTGGCCCG GCGCACACAC TTCGCCCACC GCGCCGTGCT GCTCGGTGAC ACCGTCATCA 
59461 CCACACCCCC CGCGGACCGG CCCGACGAAC TCGTCTTCGT CTACTCCGGC CAGGGCACCC 
59521 AGCATCCCGC GATGGGCGAG CAGCTCGCCG CCGCCCATCC CGTGTTCGCC GACGCCTGGC
59581 ATGAAGCGCT CCGCCGCCTT GACAACCCCG ACCCCCACGA CCCCACGCAC AGCCAGCATG
59641 TGCTCTTCGC CCACCAGGCG GCGTTCACCG CCCTCCTGCG GTCCTGGGGC ATCACCCCGC 
59701 ACGCGGTCAT CGGCCACTCG CTGGGCGAGA TCACCGCGGC GCACGCCGCC GGCATCCTGT 
59761 CGCTGGACGA CGCGTGCACC CTGATCACCA CGCGCGCCCG CCTCATGCAC ACGCTCCCGC
59821 CACCCGGTGC CATGGTCACC GTACTGACCA GCGAAGAGAA GGCACGCCAG GCGTTGCGGC 
59881 CGGGCGTGGA GATCGCCGCC GTCAACGGGC CCCACTCCAT CGTGCTGTCC GGGGACGAGG 
59941 ACGCCGTGCT CACCGTCGCC GGGCAGCTCG GCATCCACCA CCGCCTGCCC GCCCCGCACG 
60001 CCGGGCACTC CGCGCACATG GAGCCCGTGG CCGCCGAGCT GCTCGCCACC ACCCGCGGGC 
60061 TCCGCTACCA CCCTCCCCAC ACCTCCATTC CGAACGACCC CACCACCGCT GAGTACTGGG 
60121 CCGAGCAGGT CCGCAAGCCC GTGCTGTTCC ACGCCCACGC GCAGCAGTAC CCGGACGCCG 
60181 TGTTCGTGGA GATCGGCCCC GCCCAGGACC TCTCCCCGCT CGTCGACGGG ATCCCGCTGC 
60241 AGAACGGCAC CGCGGACGAG GTGCACGCGC TGCACACCGC GCTCGCGCAC CTCTACGCGC 
60301 GCGGTGCCAC GCTCGACTGG CCCCGCATCC TCGGGGCTGG GTCACGGCAC GACGCGGATG 
60361 TGCCCGCGTA CGCGTTCCAA CGGCGGCACT ACTGGATCGA GTCGGCACGC CCGGCCGCAT 
60421 CCGACGCGGG CCACCCCGTG CTGGGCTCCG GTATCGCCCT CGCCGGGTCG CCGGGCCGGG 
60481 TGTTCACGGG TTCCGTGCCG ACCGGTGCGG ACCGCGCGGT GTTCGTCGCC GAGCTGGCGC
60541 TGGCCGCCGC GGACGCGGTC GACTGCGCCA CGGTCGAGCG GCTCGACATC GCCTCCGTGC 
60601 CCGGCCGGCC GGGCCATGGC CGGACGACCG TACAGACCTG GGTCGACGAG CCGGCGGACG 
60661 ACGGCCGGCG CCGGTTCACC GTGCACACCC GCACCGGCGA CGCCCCGTGG ACGCTGCACG 
60721 CCGAGGGGGT GCTGCGCCCC CATGGCACGG CCCTGCCCGA TGCGGCCGAC GCCGAGTGGC 
60781 CCCCACCGGG CGCGGTGCCC GCGGACGGGC TGCCGGGTGT GTGGCGCCGG GGGGACCAGG 
60841 TCTTCGCCGA GGCCGAGGTG GACGGACCGG ACGGTTTCGT GGTGCACCCC GACCTGCTCG 
60901 ACGCGGTCTT CTCCGCGGTC GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACCTGA 
60961 CGGTGCACGC GTCGGACGCC ACCGTACTGC GCGCCTGCCT CACCCGGCGC ACCGACGGAG 
61021 CCATGGGATT CGCCGCCTTC GACGGCGCCG GCCTGCCGGT ACTCACCGCG GAGGCGGTGA 
61081 CGCTGCGGGA GGTGGCGTCA CCGTCCGGCT CCGAGGAGTC GGACGGCCTG CACCGGTTGG 
61141 AGTGGCTCGC GGTCGCCGAG GCGGTCTACG ACGGTGACCT GCCCGAGGGA CATGTCCTGA 
61201 TCACCGCCGC CCACCCCGAC GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA 
61261 CCCGCGTCCT GACCGCCCTG CAACACCACC TCACCACCAC CGACCACACC CTCATCGTCC
61321 ACACCACCAC CGACCCCGCC GGCGCCACCG TCACCGGCCT CACCCGCACC GCCCAGAACG 
61381 AACACCCCCA CCGCATCCGC CTCATCGAAA CCGACCACCC CCACACCCCC CTCCCCCTGG 
61441 CCCAACTCGC CACCCTCGAC CACCCCCACC TCCGCCTCAC CCACCACACC CTCCACCACC
61501 CCCACCTCAC CCCCCTCCAC ACCACCACCC CACCCACCAC CACCCCCCTC AACCCCGAAC 
61561 ACGCCATCAT CATCACCGGC GGCTCCGGCA CCCTCGCCGG CATCCTCGCC CGCCACCTGA 
61621 ACCACCCCCA CACCTACCTC CTCTCCCGCA CCCCACCCCC CGACGCCACC CCCGGCACCC 
61681 ACCTCCCCTG CGACGTCGGC GACCCCCACC AACTCGCCAC CACCCTCACC CACATCCCCC 
61741 AACCCCTCAC CGCCATCTTC CACACCGCCG CCACCCTCGA CGACGGCATC CTCCACGCCC
61801 TCACCCCCGA CCGCCTCACC ACCGTCCTCC ACCCCAAAGC CAACGCCGCC TGGCACCTGC 
61861 ACCACCTCAC CCAAAACCAA CCCCTCACCC ACTTCGTCCT CTACTCCAGC GCCGCCGCCG 
61921 TCCTCGGCAG CCCCGGACAA GGAAACTACG CCGCCGCCAA CGCCTTCCTC GACGCCCTCG 
61981 CCACCCACCG CCACACCCTC GGCCAACCCG CCACCTCCAT CGCCTGGGGC ATGTGGCACA 
62041 CCACCAGCAC CCTCACCGGA CAACTCGACG ACGCCGACCG GGACCGCATC CGCCGCGGCG 
62101 GTTTCCTCCC GATCACGGAC GACGAGGGCA TGCGCCTCTA CGAGGCGGCC GTCGGCTCCG 
62161 GCGAGGACTT CGTCATGGCC GCCGCGATGG ACCCGGCACA GCCGATGACC GGCTCCGTAC 
62221 CGCCCATCCT GAGCGGCCTG CGCAGGAGCG CGCGGCGCGT CGCCCGTGCC GGGCAGACGT 
62281 TCGCCCAGCG GCTCGCCGAG CTGCCCGACG CCGACCGCGG CGCGGCGCTG ACCACCCTCG 
62341 TCTCGGACGC CACGGCCGCC GTGCTCGGCC ACGCCGACGC CTCCGAGATC GCGCCGACCA 
62401 CGACGTTCAA GGACCTCGGC ATCGACTCGC TCACCGCGAT CGAGCTGCGC AACCGGCTCG
62461 CGGAGGCGAC CGGGCTGCGG CTGAGTGCCA CGCTGGTGTT CGACCACCCG ACACCTCGGG
62521 TCCTCGCCGC CAAGCTCCGC ACCGATCTGT TCGGCACGGC CGTGCCCACG CCCGCGCGGA 
62581 CGGCACGGAC CCACCACGAC GAGCCACTCG CGATCGTCGG CATGGCGTGC CGACTGCCCG 
62641 GCGGGGTCGC CTCGCCGGAG GACCTGTGGC AGCTCGTGGC GTCCGGCACC GACGCGATCA
62701 CCGAGTTCCC CACCGACCGC GGCTGGGACA TCGACCGGCT GTTCGACCCG GACCCGGACG 
62761 CCCCCGGCAA GACCTACGTC CGGCACGGCG GCTTCCTCGC CGAGGCCGCC GGCTTCGATG
62821 CCGCGTTCTT CGGCATCAGC CCGCGCGAGG CACGGGCCAT GGACCCGCAG CAGCGCGTCA
62881 TCCTCGAAAC CTCCTGGGAG GCGTTCGAGA ACGCGGGCAT CGTGCCGGAC ACGCTGCGCG 
62941 GCAGCGACAC CGGCGTGTTC ATGGGCGCGT TCTCCCATGG GTACGGCGCC GGCGTCGACC 
63001 TGGGCGGGTT CGGCGCCACC GCCACGCAGA ACAGCGTGCT CTCCGGCCGG TTGTCGTACT 
63061 TCTTCGGCAT GGAGGGCCCG GCCGTCACCG TCGACACCGC CTGCTCGTCG TCGCTGGTCG
63121 CCCTGCACCA GGCGGCACAG GCGCTGCGGA CTGGAGAATG CTCGCTGGCG CTCGCCGGCG 
63181 GTGTCACGGT GATGCCCACC CCGCTGGGCT ACGTCGAGTT CTGCCGCCAG CGGGGACTCG 
63241 CCCCCGACGG CCGTTGCCAG GCCTTCGCGG AAGGCGCCGA CGGCACGAGC TTCTCGGAGG 
63301 GCGCCGGCGT TCTTGTGCTG GAGCGGCTCT CCGACGCCGA GCGCAACGGA CACACCGTCC 

63361 TCGCGGTCGT CCGCTCCTCC GCCGTCAACC AGGACGGCGC CTCCAACGGC ATCTCCGCAC
63421 CCAACGGCCC CTCCCAGCAG CGCGTCATCC GCCAGGCCCT CGACAAGGCC GGGCTCGCCC 
63481 CCGCCGACGT GGACGTGGTG GAGGCCCACG GCACCGGAAC CCCGCTGGGC GACCCGATCG
63541 AGGCACAGGC CATCATCGCG ACCTACGGCC AGGACCGCGA CACACCGCTC TACCTCGGTT 
63601 CGGTCAAGTC GAACATCGGA CACACCCAGA CCACCGCCGG TGTCGCCGGC GTCATCAAGA 

63661 TGGTCATGGC GATGCGCCAC GGCATCGCGC CGAAGACACT GCACGTGGAC GAGCCGTCGT
63721 CGCATGTGGA CTGGACCGAG GGTGCGGTGG AACTGCTCAC CGAGGCGAGG CCGTGGCCCG 
63781 ACGCGGGACG CCCGCGCCGC GCGGGCGTGT CGTCGCTCGG TATCAGCGGT ACGAACGCCC
63841 ACGTGATCCT TGAGGGTGTT CCCGGGCCGT CGCGTGTGGA GCCGTCTGTT GACGGGTTGG 
63901 TGCCGTTGCC GGTGTCGGCT CGGAGTGAGG CGAGTCTGCG GGGGCAGGTG GAGCGGCTGG 

63961 AGGGGTATCT GCGCGGGAGT GTGGATGTGG CCGCGGTCGC GCAGGGGTTG GTGCGTGAGC
64021 GTGCTGTCTT CGGTCACCGT GCGGTACTGC TGGGTGATGC CCGGGTGATG GGTGTGGCGG 
64081 TGGATCAGCC GCGTACGGTG TTCGTCTTTC CCGGGCAGGG TGCTCAGTGG GTGGGCATGG
64141 GTGTGGAGTT GATGGACCGT TCTGCGGTGT TCGCGGCTCG TATGGAGGAG TGTGCGCGGG 
64201 CGTTGTTGCC GCACACGGGC TGGGATGTGC GGGAGATGTT GGCGCGGCCG GATGTGGCGG 

64261 AGCGGGTGGA GGTGGTCCAG CCGGCCAGCT GGGCGGTCGC GGTCAGCCTG GCCGCACTGT
64321 GGCAGGCCCA CGGGGTCGTA CCCGACGCGG TGATCGGACA CTCCCAGGGC GAGATCGCGG
64381 CGGCGTGCGT GGCCGGGGCC CTCAGCCTTG AGGACGGCGC CCGCGTGGTG GCCTTGCGCA 
64441 GCCAGGTCAT CGCGGCGCGA CTGGCCGGGC GGGGAGCGAT GGCTTCGGTG GCATTGCCGG
64501 CCGGTGAGGT CGGTCTGGTC GAGGGCGTGT GGATCGCGGC GCGTAACGGC CCCGCCTCGA 

64561 CAGTCGTGGC CGGCGAGCCG TCGGCGGTGG AGGACGTGGT GACGCGGTAT GAGACCGAAG
64621 GCGTGCGAGT GCGTCGTATC GCCGTCGACT ACGCCTCCCA CACGCCCCAC GTGGAAGCCA
64681 TCGAGGACGA ACTCGCTGAG GTACTGAAGG GAGTTGCAGG GAAGGCCGCG TCGGTGGCGT
64741 GGTGGTCGAC CGTGGACAGC GCCTGGGTGA CCGAGCCGGT GGATGAGAG7 TACTGGTACC
64801 GGAACCTGCG TCGCCCCGTC GCGCTGGACG CGGCGGTGGC GGAGCTGGAC GGGTCCGTGT 

64861 TCGTGGAGTG CAGCGCCCAT CCGGTGCTGC TGCCGGCGAT GGAACAGGCC CACACGGTGG
64921 CGTCGTTGCG CACCGGTGAC GGCGGCTGGG AGCGATGGCT GACGGCGTTG GCGCAGGCGT
64981 GGACCCTGGG CGCGGCAGTG GACTGGGACA CGGTGGTCGA ACCGGTGCCA GGGCGGCTGC
65041 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGCCA 
65101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCCATCACGG 

65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC
65221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGCT GCCGGGCACG GCCTTTGTGG 
65281 AGCTGGTCAT CCGGGCCGGT GACGAGACCG GTTGCGGGAT AGTGGATGAA CTGGTCATCG 
65341 AATCCCCCCT CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGAAGGAG 
65401 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCGGCAGCT

65461 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACGCTTCCG
65521 GTGTTGTCGG TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCCGTCGACA 
65581 CCTCGGAGTT CTACTTGCGC CTGGACGCGC TGGGCTACCG GTTCGGACCC ATGTTCCGCG 
65641 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCGTGTACGC CGAGGTCGCG CTCCCCGAGG 
65701 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC

65761 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCGTTCT
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC 
65881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA 
65941 TCGACGCGCT CGTGACCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC 
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC TCGGGGAGAC CCGGGACCTG ACCACCCGTG
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC CGGTGATCTT CCAGGTGACC GGTGGCCTCG 
66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT 
66241 TCCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG 
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGCTGATGC 
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCGTCCGCCA 
66421 CCGGTTCCCT CGACGACCTT GCCGTCGTCC CCACCGACGC CCCGGACCGG CCGCTCGCGG 
66481 CCGGCGAGGT GCGGATCGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG
66541 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCGTCCTGG 
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG 
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT 
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG
66781 TCGACCTGGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG
66841 TCGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA 
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA 
66961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT 
67021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA 
67081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA 
67141 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC
67201 CGCTCCACGC CTGGGACGTG CGGCAGGCGC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC
67261 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG 
67321 TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC 
67381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT 
67441 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA
67501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG
67561 ACCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA 
67621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA
67681 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC 
67741 GCCGTGCGCA AGGGCTGCCC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGTCAGCG 
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC 
67861 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG 
67921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG CGCCGTCGCG CCGTTGCTCC

67981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG
68041 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCATGCAGG 
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG 
68161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC 
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACCCGACGG 
68281 CGGAGGCGCT CACCGCCCAC CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG 

68341 GGGAGTCCC? GCCCGCGGTG ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC
68401 CGATCGCCAT CGTGGCGATG GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC 
68461 TGTGGCGGCT CGTCGAGTCC GGCACCGACG CGATCACCAC GCCTCCTGAC GACCGCGGCT
68521 GGCACGTCGA CGCGCTGTAC GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC
68581 GGGGCGGTTA CCTGGCCGGG GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC
68641 GCGAAGCGCT CGGCATGGAC CCGCAGCAAC GCCTGCTGCT CGAAACGGCG TGGGAGGCGA
68701 TC3AGCGCGG CCGGATCAGT CCGGCGTCGC TCCGCGGCCG GGAGGTCGGC GTCTATGTCG 
68761 GTGCGGCCGC GCAGGGCTAC GGGCTGGGCG CCGAGGACAC CGAGGGCCAC GCGATCACCG
68821 GTCGTTCCAC GAGCCTGCTG TCCGGACGGC TGGCGTACGT GCTCGGGCTG GAGGGCCCGG
68881 CGGTCACCGT GGACACGGCG TGCTCGTCGT CTCTGGTCGC GCTGCATCTG GCGTGCCAGG 
68941 GGCTGCGCCT GGGCGAGTGC GAACTCGCTC TGGCCGGAGG GGTCTCCGTA CTGAGTTCGC
69001 CGGCCGCGTT CGTGGAGTTC TCCCGCCAGC GCGGGCTCGC GGCCGACGGG CGCTGCAAGT 
69061 CGTTCGGCGC GGGCGCGGAC GGCACGACGT GGTCCGAGGG CGTGGGCGTG CTCGTACTGG 
69121 AACGGCTCTC CGACGCCGAG CGGCTCGGGC ACACCGTGCT CGCCGTCGTC CGCGGCAGCG 
69181 CCGTCACGTC CGACGGCGCC TCCAACGGCC TCACCGCGCC GAACGGGCTC TCGCAGCAGC 
69241 GGGTCATCCG GAAGGCGCTC GCCGCGGCCG GGCTGACCGG CGCCGACGTG GACGTCGTCG
69301 AGGGGCACGG CACCGGCACC CGGCTCGGCG ACCCGGTCGA GGCGGACGCG CTGCTCGCGA
69361 CGTACGGGCA GGACCGTCCG GCACCGGTCT GGCTGGGCTC GCTGAAGTCG AACATCGGAC 
69421 ATGCCACGGC CGCGGCCGGT GTCGCGGGCG TCATCAAGAT GGTGCAGGCG ATCGGCGCGG 
69481 GCACGATGCC GCGGACGCTG CATGTGGAGG AGCCCTCGCC CGCCGTCGAC TGGAGCACCG
69541 GACAGGTGTC CCTGCTCGGC TCCAACCGGC CCTGGCCGGA CGACGAGCGT CCGCGCCGGG 
69601 CGGCCGTCTC CGCGTUCGGG CTCAGCGGGA CGAACGCGCA CGTCATCCTG GAACAGCACC 
69661 GTCCGGCGCC CGTGGCGTCC CAGCCGCCCC GGCCGCCCCG TGAGGAGTCC CAGCCGCTGC 
69721 CGTGGGTGCT CTCCGCGCGG ACTCCGGCCG CGCTGCGGGC CCAGGCGGCC CGGCTGCGCG 
69781 ACCACCTCGC GGCGGCACCG GACGCGGATC CGTTGGACAT CGGGTACGCG CTGGCCACCA 
69841 GCCGCGCCCA GTTCGCCCAC CGTGCCGCGG TCGTCGCCAC CACCCCGGAC GGATTCCGTG 
69901 CCGCGCTCGA CGGCCTCGCG GACGGCGCGG AGGCGCCCGG AGTCGTCACC GGGACCGCTC 
69961 AGGAGCGGCG CGTCGCCTTC CTCTTCGACG GCCAGGGCGC CCAGCGCGCC GGAATGGGGC 
70021 GCGAGCTCCA CCGCCGGTTC CCCGTCTTCG CCGCCGCGTG GGACGAGGTC TCCGACGCGT
70081 TCGGCAAGCA CCTCAAGCAC TCCCCCACGG ACGTCTACCA CGGCGAACAC GGCGCTCTCG
70141 CCCATGACAC CCTGTACGCC CAGGCCGGCC TGTTCACGCT CGAAGTGGCG CTGCTGCGGC
70201 TGCTGGAGCA CTGGGGGGTG CGGCCGGACG TGCTCGTCGG GCACTCCGTC GGCGAGGTGA 
70261 CCGCGGCGTA CGCGGCGGGG GTGCTCACCC TGGCGGACGC GACGGAGTTG ATCGTGGCCC
70321 GGGGGCGGGC GCTGCGGGCG CTGCCGCCCG GGGCGATGCT CGCCGTCGAC GGAAGCCCGG
70381 CGGAGGTCGG CGCCCGCACG GATCTGGACA TCGCCGCGGT CAACGGCCCG TCCGCCGTGG
70441 TGCTCGCCGG TTCGCCGGAC GATGTGGCGG CGTTCGAACG GGAGTGGTCG GCGGCCGGGC
70501 GGCGCACGAA ACGGCTCGAC GTCGGGCACG CGTTCCACTC CCGGCACGTC GACGGTGCGC
70561 TCGACGGCTT CCGTACGGTG CTGGAGTCGC TCGCGTTCGG CGCGGCGCGG CTGCCGGTGG 
70621 TGTCCACGAC GACGGGCCGG GACGGCGCGG ACGACCTCAT AACGCCCGCG CACTGGCTGC
70681 GCCATGCGCG TCGGCCGGTG CTGTTCTCGG ATGCCGTCCG GGAGCTGGCC GACCGCGGCG
70741 TCACCACGTT CGTGGCCGTC GGCCCCTCCG GCTCCCTGGC GTCGGCCGCG GCGGAGAGCG
70801 CCGGGGAGGA CGCCGGGACC TACCACGCGG TGCTGCGCGC CCGGACCGGT GAGGAGACCG
70861 CGGCGCTGAC CGCCCTCGCC GAGCTGCACG CCCACGGCGT CCCGGTCGAC CTGGCCGCGG
70921 TACTGGCCGG TGGCCGGCCA GTGGACCTTC CCGTGTACGC GTTCCAGCAC CGTTCCTACT
70981 GGCTGGCCCC GGCCGTGGCG GGGGCGCCGG CCACCGTGGC GGACACCGGG GGTCCGGCGG 
71041 AGTCCGAGCC GGAGGACCTC ACCGTCGCCG AGATCGTCCG TCGGCGCACC GCGGCGC'IGC 
71101 TCGGCGTCAC GGACCCCGCC GACGTCGATG CGGAAGCGAC GTTCTTCGCG CTCGGTTTCG 
71161 ACTCACTGGC GGTGCAGCGG CTGCGCAACC AGCTCGCCTC GGCAACCGGG CTGGACCTGC 
71221 CGGCGGCCGT CCTGTTCGAC CACGACACCC CGGCCGCGCT CACCGCGTTC CTCCAGGACC 
71281 GGATCGAGGC CGGCCAGGAC CGGATCGAGG CCGGCGAGGA CGACGACGCG CCCACCGTGC 
71341 TCTCGCTCCT GGAGGAGATG GAGTCGCTCG ACGCCGCGGA CATCGCGGCG ACGCCGGCCC 
71401 CGGAGCGTGC GGCCATCGCC GATCTGCTCG ACAAGCTCGC CCATACCTGG AAGGACTACC
71461 GATGAGCACC GATACGCACG AGGGAACGCC GCCCGCCGGC CGCTGCCCAT TCGCGATCCA
71521 GGACGGTCAC CGCGCCATCC TGGAGAGCGG CACGGTGGGT TCGTTCGACC TGTTCGGCGT 
71581 CAAGCACTGG CTGGTCGCCG CCGCCGAGGA CGTCAAGCTG GTCACCAACG ATCCGCGGTT 
71641 CAGCTCGGCC GCGCCGTCCG AGATGCTGCC CGACCGGCGG CCCGGCTGGT TCTCCGGGAT 
71701 GGACTCACCG GAGCACAACC GCTACCGGCA GAAGATCGCG GGGGACTTCA CACTGCGCGC

71761 GGCGCGCAAG CGGGAGGACT TCGTCGCCGA GGCCGCCGAC GCCTGCCTGG ACGACATCGA
71821 GGCCGCGGGA CCCGGCACCG ACCTCATCCC CGGGTACGCC AAGCGGCTGC CCTCCCTCGT

71881 CATCAACGCG CTGTACGGGC TCACCCCTGA GGAGGGGGCC GTGCTGGAGG CACGGATGCG
71941 CGACATCACC GGCTCGGCCG ATCTGGACAG CGTCAAGACG CTGACCGACG ACTTCTTCGG 
72001 GCACGCGCTG CGGCTGGTCC GCGCGAAGCG TGACGAGCGG GGCGAGGACC TGCTGCACCG
72061 GCTGGCCTCG GCCGACGACG GCGAGATCTC GCTCAGCGAC GACGAGGCGA CGGGCGTGTT 
72121 CGCGACGCTG CTGTTCGCCG GCCACGACTC GGTGCAGCAG ATGGTCGGCT ACTGCCTCTA 
72181 CGCACTGCTC AGCCACCCCG AGCAGCAGGC GGCGCTGCGC GCGCGCCCGG AGCTGGTCGA 
72241 CAACGCGGTC GAGGAGATGC TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCGT
72301 CTGTGTCGAG GACGTCGATG TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC 
72361 GCTCTACTCG ACGGCCAACC GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT
72421 GACGCGCCCG CTGGAGGGCA ACTTCGCGTT CGGCCACGGC ATTCACAAGT GTCCCGGCCA 
72481 GCACATCGCC CGGGTGCTCA TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA
72541 CGTCCGGCTG GCCGGCGACG TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA
72601 GCTGCGGGTC ACCTGGGGGG CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC
72661 GGGACGACGG TCGCGCACAT CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC
72721 ACCCAGCGCT GCTACCTGCG CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTCGAC 
72781 GTCGGCGCGA ACATCGGCAT GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC
72841 GTGCACGCCT TCGAGCCCGC GCCCGTGCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG 
72901 CACGGCATCC CGGGCCAGGC GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG 
72961 ATGACCTTCT ATCCCGACGC CACGCTGATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG
73021 ACGGAGCTGT TGCGCACGCT CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC 
73081 ATGCTCGCGC AACTGCCCGA CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC
73141 GACGTCATCG CGGAGCGCGG TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG 
73201 AGCGAACGGC AGGTCTTCGC CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC 
73261 GTCGCGGAGG TCCACGACAT CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC
73321 CATGGCTTCA CCGTGGTCGC CGAGCAGGAA CCGCTGTTCG CCGGCACGGG CATCCACCAG 
73381 GTCGCCGCGC GGCGGGTGGC CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG
73441 GCCGCGGTGC GGACGGCGGC TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG
73501 CCCTTCACCC CCAGCTTGCG GAACACGTTG GTGAGGTGCT GTTCCACCGT GCTGGAGGTG 
73561 ACGAACAGCT GGCTGGCGAT CTCCTTGTTG GTGCGCCCGA CCGCGGCGTG CGACGCCACC 
73621 CGCCGCTCCG CCTCGGTCAG CGATGTGATC CGCTGCGCCG GCGTCACGTC CTGGGTGCCG 
73681 TCCGCGTCCG AGGACTCCCC ACCGAGCCGC CGGAGGAGCG GCACGGCTCC GCACTGGGTC
73741 GCGAGGTGCC GTGCGCGGCG GAACAGTCCC CGCGCACGGC TGTGCCGCCG GAGCATGCCG 
73801 CACGCTTCGC CCATGTCGGC GAGGACGCGG GCCAGCTCGT ACTGGTCGCG GCACATGATG 
73861 AGCAGATCGG CGGCCTCGTC GAGCAGTTCG ATCCGCTTGG CCGGCGGACT GTAGGCCGCC 
73921 TGCACCCGCA GCGTCATCAC CCGCGCCCGG GACCCCATCG GCCGGGACAG CTGCTCGGAG 
73981 ATGAGCCTCA GCCCCTCGTC ACGGCCGCGG CCGAGCAGCA GAAGCGCTTC GGCGGCGTCG
74041 ACCCGCCACA GGGCCAGGCC CGGCACGTCG ACGGACCAGC GTCGCATCCG CTCCCCGCAG
74101 TCCCGGAACG CGTTGTACGC CGCCCGGTAC CGCCCGGCCG CGAGATGGTG TTGCCCACGG 
74161 GCCCAGACCA TGTGCAGTCC GAAGAGGCTG TCGGAGGTCT CCTCCGGCAA CGGCTCGGCG 
74221 AGCCACCGCT CCGCCCGGTC CAGGTCGCCC AGTCGGATCG CGGCGGCCAC GGTGCTGCTC
74281 AGCGGCAATG CGGCGGCCAT CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG
74341 CCGCATTCGA CGGCGGCGGT CAGGTCGCCG CGGCGCAGCG CGGCCTCGGC GCGGAACCCC
74401 GCGTGGACCG CCTCGTCGGC CGGGGTCCGC ATGTTGTCGT CACCGGCCAG CTTGTCGACC 
74461 CAGGACTGGA CGGCATCGGT GTCCTCGGCG TAGAGCAGGG CCAGCAACGC CATCATGGTC
74521 GTGGTCCGGT CCGTCGTGAC CCGGGAGTGC TGGAGCACGT ACTCGGCTTT GGCCTCGGCC 
74581 TGTTCGGACC AGCCGCGCAG CGCGTTGCTC AGGGCCTTGT CGGCGACGGC GCGGTGCCGG
74641 ACGGCTCCGG AAAACGAGGC GACCTCGTCC TCGGCCGGCG GATCGGCCGG ACGCGGCGGA
74701 TCGGCCGCGC CGGGATAGAT CAGCGCGAGG GACAGGTCCG CGACGCGCAG GTGCGCCCGG 
74761 CCCTGCTCGC TCGGGGCGGC GGAGCGCTGG GCCGCCAGGA CCTCGGCGGC CTCGCCCGGC
74821 CGCCCGTCCA TCGCCAGCCA GCAGGCGAGC GACACGGCGT GCTCGCTGGA GAGGAGCCGT
74881 TCCCGCGACG CGGTGAGCAG CTCGGGCACA TGCCGGCCGG ATCTGGCGGG ATCGCAGAGC
74941 CGCTCGATGG CGGCGGTGTC GACGCGCAGT GCGGCGTGGA CGGCGGGGTC GTCGGAGGCC
75001 CGGTAGGCGA ACTCCAGGTA GGTGACGGCC TCGTCGAGCT CGCCGCGCAG GTGGTGCTCG
75061 CGCGCGGCGT CGGTGAACAG CCCGGCGACC TCGGCGCCGT GCACCCGGCC GGTACCCATC
75121 TGGTGGCGGG CGAGCACCTT GCTGGCCACG CCGCGGTCCC GCAGCAGTTC CAGCGCCAGC
75181 TCGTGCAGGC CACGCCGCTC GGCGGCGGAG AGGTCGTCGA GTACGACGGA GCGGGCCGCG
75241 GGGTGCGGGA ACCGCCCTTC CCGCAGCAGC CGCCCCTCGA CCAGCTGTTC GTGGGCCTGC 
75301 TCGACCGCCT CGGTGTCGAG GCCGGTCATC CGCTGGACGA GGGTGAGTTC GACACTCTCG 
75361 CCGAGCACGG CGGAAGCTCG GGCGACGCTC AGCGCGGCCG GGCCGCAACG ATAGAGCGAC 
75421 CCGAGGTAGG CGAGCCGGTA CGCCCGCCCC GCGACCACTT CCAGGCACCC TGAGGTCCGT
75481 GTCCGTGCCT CCCGGATGTC GTCGATCAGG CCGTGGCCGA GGAGCAGGTT GCCGCCGGTC
75541 GCCCGGAACG CCTGGGCCAC CACGTCGTCG TGCGCGTCCT GGCCGAGGTG CCGGCGCACG
75601 AGTTCGGTGG TCTGCGCCTC GGTGAGCGGG CGCAGCGCGA TCTCCTGGTA GTGGCGCAGA
75661 CTCAGCAGTG CCGCCCGGAA TTGGGAGTGG GCGGGCGTCG GCCGGAGCAG CTCGGTCAGC 
75721 ACGATGGCGA CACGGGCCCG GCTGATGCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC
75781 GGCGCGTCGG CGTGGTGCAC GTCGTCGATG CCGATCAGTA CGGGCCGCTC CGCGGCGAGC
75841 GTCAGCACCG TGCGGGTGAG TTCGGTCCCC AGGCGGTTGT CGACGTCGGC CGGCAGGTTT 
75901 TCGCACGATG CCGTCAGCCG GACCAGCTCC GGTGTCCGGG CGGCCAGCTC GGGCTGGTCG 
75961 AGGAGCTGGC CGAGCATGCC GTACGGCAGG GCCCGCTCCT CCATGGAGCA CACCGCGCGA 
76021 AGGGTGACGA AGCCGGCCTT GGCCGCGGCG GCGTCGAGGA GTTCGGTCTT GCCGCAGGCG
76081 ATCGGCCCGG TGACGGCGGC GACGACGCCC CGCCCGCCCC CCGCTCGGGT GAGCGCCCGG
76141 TGGAGGGAAC CGAACTCGTC ATCGCGGGCG ATCAGGTCTG GGGGAGATAA GCGCGCTATC 
76201 ACGAATGGAA CTACCTCGCG ACCGTCGTGG AAACCCATAG GCATCACATG GCTTGTTGAT
76261 CTGTACGGCT GTGATTCAGC CTGGCGGGAT GCTGTGCTAC AGATGGGAAG ATGTGATCTA 

76321 GGGCCGTGCC GTTCCCTCAG GAGCCGACCG CCCCCGGCGC CACCCGCCGT ACCCCCTGGG
76381 CCACCAGCTC GGCGACCCGC TCCTGGTGGT CGACGAGGTA GAAGTGCCCG CCGGGGAAGA
76441 CCTCCACCGT GGTCGGCGCG GTCGTGTGCC CGGCCCAGGC GTGGGCCTGC TCCACCGTCG
76501 TCTTCGGATC GTCGTCACCG ATGCACACCG TGATCGGCGT CTCCAGCGGC GGCGCGGGCT
76561 CCCACCGGTA CGTCTCCGCC GCGTAGTAGT CCGCCCGCAA CGGCGCCAGG ATCAGCGCGC

76621 GCATTTCGTC GTCCGCCATC ACATCGGCGC TCGTCCCGCC GAGGCCGATG ACCGCCGCCA
76681 GCAGCTCGTC GTCGGACGCG AGGTGGTCCT GGTCGGCGCG CGGCTGCGAC GGCGCCCGCC
76741 GGCCCGAGAC GATCAGGTGC GCCACCGGGA GCCGCTGGGC CAGCTCGAAC GCGAGTGTCG
76801 CGCCCATGCT GTGGCCGAAC AGCACCAGCG GACGGTCCAG CCCCGGCTTC AACGCCTCGG
76861 CCACGAGGCC GGCGAGAACA CGCAGGTCGC GCACCGCCTC CTCGTCGCGG CGGTCCTGGC

76921 GGCCGGGGTA CTGCACGGCG TACACGTCCG CCACCGGGGC GAGCGCACGG GCCAGCGGAA
76981 GGTAGAACGT CGCCGATCCG CCGGCGTGGG GCAGCAGCAC CACCCGTACC GGGGCCTCGG 
77041 GCGTGGGGAA GAACTGCCGC AGCCAGAGTT CCGAGCTCAC CGCACCCCCT CGGCCGCGAC 
77101 CTGGGGAGCC CGGAACCGGG TGATCTCGGC CAAGTGCTTC TCCCGCATCT CCGGGTCGGT 
77161 CACGCCCCAT CCCTCCTCCG GCGCCAGACA GAGGACGCCG ACTTTGCCGT TGTGCACATT 

77221 GCGATGCACA TCGCGCACCG CCGACCCGAC GTCGTCGAGC GGGTAGGTCA CCGACAGCGT
77281 CGGGTGCACC ATCCCCTTGC AGATCAGGCG GTTCGCCTCC CACGCCTCAC GATAGTTCGC 
77341 GAAGTGGGTA CCGATGATCC GCTTCACGGA CATCCACAGG TACCGATTGT CAAAGGCGTG 
77401 CTCGTATCCC GAGGTTGACG CGCAGGTGAC GATCGTGCCA CCCCGACGTG TCACGTAGAC
77461 ACTCGCGCCG AACGTCGCGC GCCCCGGGTG CTCGAACACG ATGTCGGGAT CGTCACCGCC

77521 GGTCAGCTCC CGGATC



Those of skill in the art will recognize that, due to the degenerate nature of the
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be
used to encode a given amino acid sequence of the invention. The native DNA sequence
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to
illustrate a preferred embodiment of the invention, and the present invention includes
DNA compounds of any sequence that encode the amino acid sequences of the
polypeptides and proteins of the invention. In similar fashion, a polypeptide can typically
tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid
sequence without loss or significant loss of a desired activity. The present invention
includes such polypeptides with alternate amino acid sequences, and the amino acid
sequences shown merely illustrate preferred embodiments of the invention.

The recombinant nucleic acids, proteins, and peptides of the invention are many
and diverse. To facilitate an understanding of the invention and the diverse compounds
and methods provided thereby, the following general description of the FK-520 PKS
genes and modules of the PKS proteins encoded thereby is provided. This general
description is followed by a more detailed description of the various domains and
modules of the FK-520 PKS contained in and encoded by the compounds of the
invention. In this description, reference to a heterologous PKS refers to any PKS other
than the FK-520 PKS. Unless otherwise indicated, reference to a PKS includes reference
to a portion of a PKS. Moreover, reference to a domain, module, or PKS includes
reference to the nucleic acids encoding the same and vice-versa, because the methods and
reagents of the invention provide or enable one to prepare proteins and the nucleic acids
that encode them.
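
The degeneracy of the genetic code noted above can be illustrated with a short Python sketch. It builds the standard codon table and translates two DNA sequences that differ at several nucleotide positions yet specify the same peptide. The two 21-nucleotide sequences and the function name translate are hypothetical and chosen only for illustration; they are not taken from the FK-520 gene cluster.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, reading frame 1, ignoring a trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical coding sequences that use different synonymous codons.
variant_a = "ATGAGCACCGATACGCACGAG"
variant_b = "ATGTCGACTGACACCCATGAA"

peptide_a = translate(variant_a)
peptide_b = translate(variant_b)
```

Under these assumptions both variants translate to the same seven-residue peptide even though their nucleotide sequences differ, which is the sense in which DNA compounds of differing sequence can encode a given amino acid sequence.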

The FK-520 PKS is composed of three proteins encoded by three genes
designated fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7 - 10 of the
PKS. The fkbB ORF encodes the loading module (the CoA ligase) and extender modules
1 - 4 of the PKS. The fkbC ORF encodes extender modules 5 - 6 of the PKS. The fkbP
ORF encodes the NRPS that attaches the pipecolic acid and cyclizes the FK-520
polyketide.
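
The gene-to-module assignments just described can be summarized as a small lookup structure. The following Python sketch is only an illustrative restatement of the preceding paragraph; the dictionary layout, the module labels, and the helper names gene_for_module and modules_in_numbered_order are assumptions introduced here.

```python
# Gene-to-module assignments as stated in the text: fkbB carries the loading
# module and extender modules 1-4, fkbC modules 5-6, fkbA modules 7-10.
FK520_PKS_GENES = {
    "fkbB": ["loading"] + [f"extender {n}" for n in range(1, 5)],
    "fkbC": [f"extender {n}" for n in range(5, 7)],
    "fkbA": [f"extender {n}" for n in range(7, 11)],
}

# fkbP encodes the NRPS rather than a PKS protein, so it is kept separate.
NRPS_GENE = "fkbP"


def gene_for_module(module: str) -> str:
    """Return the gene whose ORF encodes the named module."""
    for gene, modules in FK520_PKS_GENES.items():
        if module in modules:
            return gene
    raise KeyError(f"unknown module: {module}")


def modules_in_numbered_order() -> list:
    """List the loading module followed by extender modules 1 through 10."""
    return [m for gene in ("fkbB", "fkbC", "fkbA") for m in FK520_PKS_GENES[gene]]
```

For example, gene_for_module("extender 5") returns "fkbC", and modules_in_numbered_order() lists eleven modules beginning with the loading module.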

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain,
and an ACP domain. The starter building block or unit for FK-520 is believed to be a
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The
recombinant DNA compounds of the invention that encode the loading module of the
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a variety
of methods and in a variety of compounds. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 loading module is inserted into a DNA
compound that comprises the coding sequence for a heterologous PKS. The resulting
construct, in which the coding sequence for the loading module of the heterologous PKS
is replaced by the coding sequence for the FK-520 loading module, provides a novel PKS
coding sequence. Examples of heterologous PKS coding sequences include the
rapamycin, FK-506, rifamycin, and avermectin PKS coding sequences. In another
embodiment, a DNA compound comprising a sequence that encodes the FK-520 loading
module is inserted into a DNA compound that comprises the coding sequence for the
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, a portion of the loading module coding sequence is
utilized in conjunction with a heterologous coding sequence. In this embodiment, the
invention provides, for example, replacing the CoA ligase with a different CoA
ligase, deleting the ER, or replacing the ER with a different ER. In addition, or
alternatively, the ACP can be replaced by another ACP. In similar fashion, the
corresponding domains in another loading or extender module can be replaced by one or
more domains of the FK-520 PKS. The resulting heterologous loading module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide.
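
The domain replacements and deletions described for the loading module can be modeled as operations on an ordered list of domains. The Python sketch below is a bookkeeping illustration only, not a cloning protocol; the class names Domain and Module, the method names, and the choice of rapamycin as the donor PKS are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Domain:
    kind: str    # e.g. "CoA ligase", "ER", "ACP"
    source: str  # PKS from which the domain coding sequence originates


@dataclass(frozen=True)
class Module:
    name: str
    domains: Tuple[Domain, ...]

    def kinds(self) -> List[str]:
        return [d.kind for d in self.domains]

    def replace_domain(self, kind: str, new_source: str) -> "Module":
        """Return a new module in which the domain of this kind comes from new_source."""
        if kind not in self.kinds():
            raise ValueError(f"{self.name} has no {kind} domain")
        swapped = tuple(
            Domain(kind, new_source) if d.kind == kind else d for d in self.domains
        )
        return Module(self.name, swapped)

    def delete_domain(self, kind: str) -> "Module":
        """Return a new module lacking the domain of this kind."""
        if kind not in self.kinds():
            raise ValueError(f"{self.name} has no {kind} domain")
        return Module(self.name, tuple(d for d in self.domains if d.kind != kind))


# The FK-520 loading module as described in the text.
fk520_loading = Module(
    "FK-520 loading module",
    (Domain("CoA ligase", "FK-520"), Domain("ER", "FK-520"), Domain("ACP", "FK-520")),
)

# One hypothetical hybrid: ER deleted, ACP taken from a heterologous PKS.
hybrid_loading = fk520_loading.delete_domain("ER").replace_domain("ACP", "rapamycin")
```

Because each operation returns a new Module, the native loading module remains unchanged and several hybrid designs can be derived from it side by side.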

The first extender module of the FK-520 PKS includes a KS domain, an AT
domain specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP
domain. The recombinant DNA compounds of the invention that encode the first
extender module of the FK-520 PKS and the corresponding polypeptides encoded
thereby are useful for a variety of applications. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 first extender module is inserted into a
DNA compound that comprises the coding sequence for a heterologous PKS. The
resulting construct, in which the coding sequence for a module of the heterologous PKS
is either replaced by that for the first extender module of the FK-520 PKS or the latter is
merely added to coding sequences for modules of the heterologous PKS, provides a novel
PKS coding sequence. In another embodiment, a DNA compound comprising a sequence
that encodes the first extender module of the FK-520 PKS is inserted into a DNA
compound that comprises the remainder of the coding sequence for the FK-520 PKS or a
recombinant FK-520 PKS that produces an FK-520 derivative.

In another embodiment, all or only a portion of the first extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or
2-hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the
DH or KR or both with another DH or KR; and/or inserting an ER. In replacing or 
inserting KR, DH, and ER domains, it is often beneficial to replace the existing KR, DH, 
and ER domains with the complete set of domains desired from another module. Thus, if 
one desires to insert an ER domain, one may simply replace the existing KR and DH
domains with a KR, DH, and ER set of domains from a module containing such domains. 
In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS,
from a gene for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous first extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the first
extender module of the FK-520 PKS.
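
The domain replacements described for the first extender module can be pictured with a small data model. The following Python sketch is illustrative only: the class name, the helper functions, and the donor module are hypothetical and form no part of the invention. It treats a module as an AT substrate specificity plus a set of reductive domains flanked by a KS and an ACP, and shows an AT replacement combined with replacement of the existing reductive domains by the complete set taken from a donor module.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Module:
    """Hypothetical model of one PKS extender module."""
    name: str
    at_substrate: str            # substrate the AT domain is specific for
    reductive: Tuple[str, ...]   # reductive domains present (KR, DH, ER)

    def domains(self) -> Tuple[str, ...]:
        # Every extender module in this model is KS + AT + reductive set + ACP.
        return ("KS", f"AT({self.at_substrate})", *self.reductive, "ACP")


def swap_at(module: Module, new_substrate: str) -> Module:
    """Return a copy of the module whose AT has a different specificity."""
    return replace(module, at_substrate=new_substrate)


def swap_reductive_set(module: Module, donor: Module) -> Module:
    """Replace the whole reductive set with the complete set from a donor."""
    return replace(module, reductive=donor.reductive)


# First extender module as described in the text: methylmalonyl AT, DH, KR.
fk520_m1 = Module("FK-520 module 1", "methylmalonyl CoA", ("DH", "KR"))

# Hypothetical donor module that carries a full KR, DH, and ER set.
donor = Module("donor module (hypothetical)", "malonyl CoA", ("KR", "DH", "ER"))

# Inserting an ER by replacing KR and DH with the donor's KR, DH, and ER set,
# together with an AT replacement.
hybrid = swap_reductive_set(swap_at(fk520_m1, "malonyl CoA"), donor)
print(hybrid.domains())
```

The model is frozen so that each replacement yields a new module description and the native module remains available for comparison.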

In an illustrative embodiment of this aspect of the invention, the invention 
provides recombinant PKSs and recombinant DNA compounds and vectors that encode 
such PKSs in which the KS domain of the first extender module has been inactivated. 
Such constructs are especially useful when placed in translational reading frame with the
remaining modules and domains of an FK-520 or FK-520 derivative PKS. The utility of
these constructs is that host cells expressing, or cell free extracts containing, the PKS 
encoded thereby can be fed or supplied with N-acylcysteamine thioesters of novel 
precursor molecules to prepare FK-520 derivatives. See U.S. patent application Serial 
No. 60/117,384, filed 27 Jan. 1999, and PCT patent publication Nos. US97/02358 and
US99/03986, each of which is incorporated herein by reference.

The second extender module of the FK-520 PKS includes a KS, an AT specific 
for methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA 
compounds of the invention that encode the second extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 second extender module is inserted into a DNA compound that comprises the
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the second 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the second extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the second extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous second extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes FK- 
520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the second extender module of the FK-520 PKS. 

The third extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the third extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
third extender module is inserted into a DNA compound that comprises the coding
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the third extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the third extender
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous third extender module coding sequence
can be utilized in conjunction with a coding sequence from a PKS that synthesizes
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the third extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of
the invention that encode the fourth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
fourth extender module is inserted into a DNA compound that comprises the coding
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the fourth extender
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the fourth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the 
remainder of the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the fourth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the ethylmalonyl 
CoA specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA
specific AT; and/or deleting the inactive DH, inserting a KR, a KR and an active DH, or a 
KR, an active DH, and an ER. In addition, the KS and/or ACP can be replaced with 
another KS and/or ACP. In each of these replacements or insertions, the heterologous KS, 
AT, DH, KR, ER, or ACP coding sequence can originate from a coding sequence for 
another module of the FK-520 PKS, a PKS for a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous fourth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, 
an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the fourth extender module of the FK-520 PKS. 

As illustrative examples, the present invention provides recombinant genes, 
vectors, and host cells that result from the conversion of the FK-506 PKS to an FK-520 
PKS and vice versa. In one embodiment, the invention provides a recombinant set of
FK-506 PKS genes but in which the coding sequences for the fourth extender module or at
least those for the AT domain in the fourth extender module have been replaced by those 
for the AT domain of the fourth extender module of the FK-520 PKS. This recombinant 
PKS can be used to produce FK-520 in recombinant host cells. In another embodiment, 
the invention provides a recombinant set of FK-520 PKS genes but in which the coding 
sequences for the fourth extender module or at least those for the AT domain in the fourth
extender module have been replaced by those for the AT domain of the fourth extender
module of the FK-506 PKS. This recombinant PKS can be used to produce FK-506 in 
recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which 
the AT domain of module 4 has been replaced with a malonyl specific AT domain to 
provide a PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT 
domain to provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid
PKS of the invention is prepared by replacing the AT and inactive KR domain of FK-520 
extender module 4 with a methylmalonyl specific AT and an active KR domain, such as, 
for example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce 21- 
desethyl-21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these 
hybrid PKS enzymes are neurotrophins. 

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the fifth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 fifth 
extender module is inserted into a DNA compound that comprises the coding sequence 
for a heterologous PKS. The resulting construct, in which the coding sequence for a 
module of the heterologous PKS is either replaced by that for the fifth extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of 
the heterologous PKS, provides a novel PKS. In another embodiment, a DNA compound 
comprising a sequence that encodes the fifth extender module of the FK-520 PKS is 
inserted into a DNA compound that comprises the coding sequence for the FK-520 PKS 
or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, a portion of the fifth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting either or both of the DH and KR; replacing either or both of the
DH and KR with another KR and/or DH; and/or inserting an ER. In addition, the KS
and/or ACP can be replaced with another KS and/or ACP. In each of these replacements 
or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous fifth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the fifth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth 
extender module have been deleted or mutated to render the DH non-functional. In one 
such mutated gene, the KR and DH coding sequences are replaced with those encoding 
only a KR domain from another PKS gene. The resulting PKS genes code for the 
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-19 to C- 
20 double bond of FK-520 and has a C-20 hydroxyl group. Such analogs are preferred 
neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant fifth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this fifth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (lacking the C-19 to C-20 double 
bond of FK-506 and having a C-20 hydroxyl group) FK-506 derivative. In another 
embodiment, the present invention provides a recombinant FK-506 PKS in which the DH 
domain of module 5 has been deleted or otherwise rendered inactive and thus produces 
this novel polyketide.

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the sixth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 sixth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the sixth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another
embodiment, a DNA compound comprising a sequence that encodes the sixth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence is
utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous sixth extender module coding sequence can be
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the sixth 
extender module of the FK-520 PKS.

In an illustrative embodiment, the present invention provides a set of recombinant
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the 
sixth extender module have been deleted or mutated to render them non-functional. In 
one such mutated gene, the KR, ER, and DH coding sequences are replaced with those 
encoding only a KR domain from another PKS gene. This can also be accomplished by 
simply replacing the coding sequences for extender module six with those for an extender 
module having a methylmalonyl specific AT and only a KR domain from a heterologous 
PKS gene, such as, for example, the coding sequences for extender module two encoded 
by the eryAI gene. The resulting PKS genes code for the expression of an FK-520 PKS
that produces an FK-520 analog that has a C-18 hydroxyl group. Such analogs are 
preferred neurotrophins, because they have little or no immunosuppressant activity. This
recombinant sixth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this sixth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (having a C-18 hydroxyl group) 
FK-506 derivative. In another embodiment, the present invention provides a recombinant 
FK-506 PKS in which the DH and ER domains of module 6 have been deleted or 
otherwise rendered inactive and thus produces this novel polyketide. 

The seventh extender module of the FK-520 PKS includes a KS, an AT specific 
for 2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 seventh extender module is inserted into a DNA compound that comprises 
the coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the seventh
extender module of the FK-520 PKS or the latter is merely added to coding sequences for
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the seventh extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion or all of the seventh extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 2- 
hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the 
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS 
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, 
KR, ER, or ACP coding sequence can originate from a coding sequence for another 
module of the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide 
other than FK-520, or from chemical synthesis. The resulting heterologous seventh 
extender module coding sequence can be utilized in conjunction with a coding sequence 
for a PKS that synthesizes FK-520, an FK-520 derivative, or another polyketide. In 
similar fashion, the corresponding domains in a module of a heterologous PKS can be 
replaced by one or more domains of the seventh extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh 
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant seventh extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that
contains both this seventh extender module and the recombinant fourth extender module
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of
FK-506 but that express this recombinant PKS and so synthesize the corresponding
(C-15-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a
recombinant FK-506 PKS in which the AT domain of module 7 has been replaced and 
thus produces this novel polyketide. 
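
The relationship stated above between the AT specificity of the seventh extender module and the substituent at C-15 can be written as a simple lookup. The Python sketch below is illustrative only; the dictionary and function names are hypothetical. The entries restate the text: the native 2-hydroxymalonyl specific AT corresponds to the C-15 methoxy group of FK-520, and malonyl, methylmalonyl, and ethylmalonyl specific AT domains give hydrogen, methyl, and ethyl, respectively.

```python
# Substituent found at C-15 for each AT specificity in extender module 7,
# restating the correspondence given in the text (names are hypothetical).
C15_SUBSTITUENT_BY_AT = {
    "2-hydroxymalonyl CoA": "methoxy",   # native AT; methoxy as found in FK-520
    "malonyl CoA": "hydrogen",
    "methylmalonyl CoA": "methyl",
    "ethylmalonyl CoA": "ethyl",
}


def c15_substituent(at_specificity: str) -> str:
    """Look up the C-15 substituent expected for a given module 7 AT."""
    try:
        return C15_SUBSTITUENT_BY_AT[at_specificity]
    except KeyError:
        raise ValueError(
            f"no substituent recorded for AT specificity {at_specificity!r}"
        ) from None


for at in ("malonyl CoA", "methylmalonyl CoA", "ethylmalonyl CoA"):
    print(f"module 7 AT specific for {at}: C-15 {c15_substituent(at)}")
```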

In another illustrative embodiment, the present invention provides a hybrid PKS
in which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, the
AT and KR domains of extender module 6 of the rapamycin PKS. The resulting hybrid 
PKS produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin compound. 

The eighth extender module of the FK-520 PKS includes a KS, an AT specific for
2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of the
invention that encode the eighth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
eighth extender module is inserted into a DNA compound that comprises the coding
sequence for a heterologous PKS. The resulting construct, in which the coding sequence
for a module of the heterologous PKS is either replaced by that for the eighth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the eighth extender
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the eighth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the
2-hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or
malonyl CoA specific AT; deleting or replacing the KR; and/or inserting a DH or a DH 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS,
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous eighth extender module coding sequence 
can be utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative,
or another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the eighth extender module
of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth 
extender module have been replaced with those encoding an AT domain for malonyl,
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-13 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant eighth extender module coding sequence can be combined
with other coding sequences to make additional compounds of the invention. In an
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that
contains both this eighth extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived
from FK-506 producing host cells that have been mutated to prevent production of
FK-506 but that express this recombinant PKS and so synthesize the corresponding
(C-13-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a
recombinant FK-506 PKS in which the AT domain of module 8 has been replaced and 
thus produces this novel polyketide.

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the ninth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes
the FK-520 ninth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the ninth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another
embodiment, a DNA compound comprising a sequence that encodes the ninth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the ninth extender module coding sequence
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical
synthesis. The resulting heterologous ninth extender module coding sequence can be
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or
another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the ninth extender module 
of the FK-520 PKS.

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that 
encode the tenth extender module of the FK-520 PKS and the corresponding polypeptides 
encoded thereby are useful for a variety of applications. In one embodiment, a DNA 
compound comprising a sequence that encodes the FK-520 tenth extender module is
inserted into a DNA compound that comprises the coding sequence for a heterologous 
PKS. The resulting construct, in which the coding sequence for a module of the 
heterologous PKS is either replaced by that for the tenth extender module of the FK-520 
PKS or the latter is merely added to coding sequences for the modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a
DNA compound comprising a sequence that encodes the tenth extender module of the 
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK- 
520 derivative. 

In another embodiment, a portion or all of the tenth extender module coding
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH,
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP.
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or 
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than 
FK-520, or from chemical synthesis. The resulting heterologous tenth extender module 

25 coding sequence can be utilized in conjunction with a coding sequence for a PKS that 
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the 
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the tenth extender module of the FK-520 PKS. 
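
The domain replacements and insertions described above for the tenth extender module can be pictured with a small data model. The following Python sketch is illustrative only and is not part of the disclosed methods: the class name `Module`, the helper functions, and the display order of the reductive domains are assumptions introduced here, and the tuple order is not meant to reflect the physical order of domains in the polypeptide.

```python
from dataclasses import dataclass, replace
from typing import Tuple

# Extender-unit specificities and reductive-domain sets named in the text.
ALLOWED_AT = {"malonyl", "methylmalonyl", "ethylmalonyl", "2-hydroxymalonyl"}
ALLOWED_REDUCTIVE = {(), ("KR",), ("KR", "DH"), ("KR", "DH", "ER")}


@dataclass(frozen=True)
class Module:
    """Hypothetical description of one extender module."""
    name: str
    at_substrate: str                  # CoA substrate selected by the AT domain
    reductive: Tuple[str, ...] = ()    # optional KR / DH / ER domains

    def domains(self) -> Tuple[str, ...]:
        # Display order only; every module has at least KS, AT, and ACP.
        return ("KS", f"AT[{self.at_substrate}]", *self.reductive, "ACP")


def swap_at(module: Module, substrate: str) -> Module:
    """Return a copy of the module whose AT has a different specificity."""
    if substrate not in ALLOWED_AT:
        raise ValueError(f"unsupported AT specificity: {substrate}")
    return replace(module, at_substrate=substrate)


def set_reductive(module: Module, reductive: Tuple[str, ...]) -> Module:
    """Return a copy of the module with a KR, KR+DH, or KR+DH+ER set inserted."""
    if reductive not in ALLOWED_REDUCTIVE:
        raise ValueError(f"unsupported reductive domain set: {reductive}")
    return replace(module, reductive=reductive)


# The native tenth extender module: KS, malonyl-specific AT, ACP.
module_10 = Module("FK-520 module 10", "malonyl")
hybrid_10 = set_reductive(swap_at(module_10, "methylmalonyl"), ("KR", "DH"))
print(module_10.domains())
print(hybrid_10.domains())
```

Treating each change as a function that returns a new module mirrors the way the text describes each hybrid as a derivative of an unchanged parent coding sequence.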

The FK-520 polyketide precursor produced by the action of the tenth extender 
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The 
enzyme FkbP is the NRPS-like enzyme that catalyzes these reactions. FkbP also includes 
a thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The 
present invention provides recombinant DNA compounds that encode the fkbP gene and 
so provides recombinant methods for expressing the fkbP gene product in recombinant 
host cells. The recombinant fkbP genes of the invention include those in which the coding 
sequence for the adenylation domain has been mutated or replaced with coding sequences 
from other NRPS-like enzymes so that the resulting recombinant FkbP incorporates a 
moiety other than pipecolic acid. For the construction of host cells that do not naturally 
produce pipecolic acid, the present invention provides recombinant DNA compounds that 
express the enzymes that catalyze at least some of the biosynthesis of pipecolic acid (see 
Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a homolog of RapL, 
a lysine cyclodeaminase responsible in part for producing the pipecolate unit added to the 
end of the polyketide chain. The fkbP and fkbL recombinant genes of the invention can be 
used in heterologous hosts to produce compounds such as FK-520 or, in conjunction with 
other PKS or NRPS genes, to produce known or novel polyketides and non-ribosomal 
peptides. 

The present invention also provides recombinant DNA compounds that encode 
the P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520. 
Figure 2 shows the various sites on the FK-520 polyketide core structure at which these 
enzymes act. By providing these genes in recombinant form, the present invention 
provides recombinant host cells that can produce FK-520. This is accomplished by 
introducing the recombinant PKS, P450 oxidase, and methyltransferase genes into a 
heterologous host cell. In a preferred embodiment, the heterologous host cell is 
Streptomyces coelicolor CH999 or Streptomyces lividans K4-114, as described in U.S. 
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar. 
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by 
reference. In addition, by providing recombinant host cells that express only a subset of 
these genes, the present invention provides methods for making FK-520 precursor 
compounds not readily obtainable by other means. 

In a related aspect, the present invention provides recombinant DNA compounds 
and vectors that are useful in generating, by homologous recombination, recombinant 
host cells that produce FK-520 precursor compounds. In this aspect of the invention, a 
native host cell that produces FK-520 is transformed with a vector (such as an SCP2* 
derived vector for Streptomyces host cells) that encodes one or more disrupted genes (i.e., 
a hydroxylase, a methyltransferase, or both) or merely flanking regions from those genes. 
When the vector integrates by homologous recombination, the native, functional gene is 
deleted or replaced by the non-functional recombinant gene, and the resulting host cell 
thus produces an FK-520 precursor. Such host cells can also be complemented by 
introduction of a modified form of the deleted or mutated non-functional gene to produce 
a novel compound. 

In one important embodiment, the present invention provides a hybrid PKS and 
the corresponding recombinant DNA compounds that encode those hybrid PKS enzymes. 
For purposes of the present invention a hybrid PKS is a recombinant PKS that comprises 
all or part of one or more modules and thioesterase/cyclase domain of a first PKS and all 
or part of one or more modules, loading module, and thioesterase/cyclase domain of a 
second PKS. In one preferred embodiment, the first PKS is all or part of the FK-520 
PKS, and the second PKS is only a portion or all of a non-FK-520 PKS. 

One example of the preferred embodiment is an FK-520 PKS in which the AT 
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13 
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a 
malonyl, methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT 
domains include the AT domains from modules 3, 12, and 13 of the rapamycin PKS and 
from modules 1 and 2 of the erythromycin PKS. Such replacements, conducted at the 
level of the gene for the PKS, are illustrated in the examples below. Another illustrative 
example of such a hybrid PKS includes an FK-520 PKS in which the natural loading 
module has been replaced with a loading module of another PKS. Another example of 
such a hybrid PKS is an FK-520 PKS in which the AT domain of module three is 
replaced with an AT domain that binds methylmalonyl CoA. 

In another preferred embodiment, the first PKS is most but not all of a non-FK-520 
PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative 
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific for 
methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for malonyl 
CoA. 

Those of skill in the art will recognize that all or part of either the first or second 
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring 
source. For example, only a small portion of an AT domain determines its specificity. See 
U.S. provisional patent application Serial No. 60/091,526, incorporated herein by 
reference. The state of the art in DNA synthesis allows the artisan to construct de novo 
DNA compounds of size sufficient to construct a useful portion of a PKS module or 
domain. For purposes of the present invention, such synthetic DNA compounds are 
deemed to be a portion of a PKS. 

Thus, the hybrid modules of the invention are incorporated into a PKS to provide 
a hybrid PKS of the invention. A hybrid PKS of the invention can result not only: 

(i) from fusions of heterologous domain (where heterologous means the domains 
in that module are from at least two different naturally occurring modules) coding 
sequences to produce a hybrid module coding sequence contained in a PKS gene whose 
product is incorporated into a PKS, 

but also: 

(ii) from fusions of heterologous module (where heterologous module means two 
modules are adjacent to one another that are not adjacent to one another in naturally 
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence 
contained in a PKS gene whose product is incorporated into a PKS, 

(iii) from expression of one or more FK-520 PKS genes with one or more 
non-FK-520 PKS genes, including both naturally occurring and recombinant non-FK-520 
PKS genes, and 

(iv) from combinations of the foregoing. 

Various hybrid PKSs of the invention illustrating these various alternatives are described 
herein. 

Examples of the production of a hybrid PKS by co-expression of PKS genes from 
the FK-520 PKS and another non-FK-520 PKS include hybrid PKS enzymes produced by 
coexpression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS 
enzymes are produced in recombinant Streptomyces host cells that produce FK-520 or 
FK-506 but have been mutated to inactivate the gene whose function is to be replaced by 
the rapamycin PKS gene introduced to produce the hybrid PKS. Particular examples 
include (i) replacement of the fkbC gene with the rapB gene; and (ii) replacement of the 
fkbA gene with the rapC gene. The latter hybrid PKS produces 13,15-didesmethoxy-FK-520, 
if the host cell is an FK-520 producing host cell, and 13,15-didesmethoxy-FK-506, 
if the host cell is an FK-506 producing host cell. The compounds produced by these 
hybrid PKS enzymes are immunosuppressants and neurotrophins but can be readily 
modified to act only as neurotrophins, as described in Example 6, below. 

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing 
the fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in 
which: (a) the extender module 8 through 10, inclusive, coding sequences have been 
replaced by the coding sequences for extender modules 12 to 14, inclusive, of the 
rapamycin PKS; or (b) the module 8 coding sequences have been replaced by the 
module 8 coding sequence of the rifamycin PKS. When expressed with the other, 
naturally occurring FK-520 or FK-506 PKS genes and the genes of the modification 
enzymes, the resulting hybrid PKS enzymes produce, respectively, (a) 13-desmethoxy-FK-520 
or 13-desmethoxy-FK-506; and (b) 13-desmethoxy-13-methyl-FK-520 or 
13-desmethoxy-13-methyl-FK-506. In a preferred embodiment, these recombinant PKS 
genes of the invention are introduced into the producing host cell by a vector such as 
pHU204, which is a plasmid pRM5 derivative that has the well-characterized SCP2* 
replicon, the colE1 replicon, the tsr and bla resistance genes, and a cos site. This vector 
can be used to introduce the recombinant fkbA replacement gene in an FK-520 or FK-506 
producing host cell (or a host cell derived therefrom in which the endogenous fkbA gene 
has been rendered inactive by mutation, deletion, or homologous recombination 
with the gene that replaces it) to produce the desired hybrid PKS. 

In constructing hybrid PKSs of the invention, certain general methods may be 
helpful. For example, it is often beneficial to retain the framework of the module to be 
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to 
a module, it is often preferred to replace the KR domain of the original module with a 
KR, DH, and ER domain-containing segment from another module, instead of merely 
inserting DH and ER domains. One can alter the stereochemical specificity of a module 
by replacement of the KS domain with a KS domain from a module that specifies a 
different stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase 
domains of modular polyketide synthases in the choice and stereochemical fate of 
extender units," Biochemistry 38(5): 1643-1651, incorporated herein by reference. 
Stereochemistry can also be changed by changing the KR domain. Also, one can alter the 
specificity of an AT domain by changing only a small segment of the domain. See Lau et 
al., supra. One can also take advantage of known linker regions in PKS proteins to link 
modules from two different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr. 
1999, "Dissecting and Exploiting Intermodular Communication in Polyketide Synthases," 
Science 284: 482-485, incorporated herein by reference. 

The following Table lists references describing illustrative PKS genes and 
corresponding enzymes that can be utilized in the construction of the recombinant PKSs 
and the corresponding DNA compounds that encode them of the invention. Also 
presented are various references describing tailoring enzymes and corresponding genes 
that can be employed in accordance with the methods of the present invention. 
Avermectin 

U.S. Pat. No. 5,252,474 to Merck. 

MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular 
Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the 
Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and 
Nemadectin. 

MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the 
Streptomyces avermitilis genes encoding the avermectin polyketide synthase. 

Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the 
polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl. 
Acad. Sci. USA 96: 9509-9514. 
Candicidin (FR008) 

Hu et al., 1994, Mol. Microbiol. 14: 163-172. 
Epothilone 

U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999. 
Erythromycin 

PCT Pub. No. 93/13663 to Abbott. 

US Pat. No. 5,824,513 to Abbott. 

Donadio et al., 1991, Science 252:675-9. 

Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large 
multifunctional polypeptide in the erythromycin producing polyketide synthase of 
Saccharopolyspora erythraea. 

Glycosylation Enzymes 

PCT Pat. App. Pub. No. 97/23630 to Abbott. 
FK-506 

Motamedi et al, 1998, The biosynthetic gene cluster for the macrolactone ring of 
the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534. 

Motamedi et al, 1997, Structural organization of a multifunctional polyketide 
synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur. 
J. Biochem. 244: 74-80. 

Methyltransferase 

US 5,264,355, issued 23 Nov. 1993, Methylating enzyme from 
Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase. 

Motamedi et al., 1996, Characterization of methyltransferase and 
hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and 
FK-520, J. Bacteriol. 178: 5243-5248. 
Streptomyces hygroscopicus 

U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998. 
Lovastatin 

U.S. Pat. No. 5,744,350 to Merck. 
Narbomycin 

U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No. 
60/120,254, filed 16 Feb. 1999. 
Nemadectin 

MacNeil et al., 1993, supra. 
Niddamycin 

Kakavas et al., 1997, Identification and characterization of the niddamycin 
polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522. 
Oleandomycin 

Swan et al, 1994, Characterisation of a Streptomyces antibioticus gene encoding 
a type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet. 
242: 358-362. 

U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999. 

Olano et al, 1998, Analysis of a Streptomyces antibioticus chromosomal region 
involved in oleandomycin biosynthesis, which encodes two glycosyltransferases 
responsible for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(3): 299- 
308. 

Picromycin 

PCT patent application US99/15047, filed 2 Jul. 1999. 

Xue et al., 1998, Hydroxylation of macrolactones YC-17 and narbomycin is 
mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry 
& Biology 5(11): 661-667. 

Xue et al., Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in 
Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci. 
USA 95: 12111-12116. 
Platenolide 

EP Pat. App. Pub. No. 791,656 to Lilly. 
Rapamycin 

Schwecke et al., Aug. 1995, The biosynthetic gene cluster for the 
polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92: 7839-7843. 

Aparicio et al, 1996, Organization of the biosynthetic gene cluster for rapamycin 
in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular 
polyketide synthase, Gene 169: 9-16. 
Rifamycin 

August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin: 
deductions from the molecular analysis of the rif biosynthetic gene cluster of 
Amycolatopsis mediterranei S699, Chemistry & Biology, 5(2): 69-79. 
Sorangium PKS 

U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998. 
Soraphen 

U.S. Pat. No. 5,716,849 to Novartis. 

Schupp et al, 1995, J. Bacteriology 177: 3673-3679. A Sorangium cellulosum 
(Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic 
Soraphen A: Cloning, Characterization, and Homology to Polyketide Synthase Genes 
from Actinomycetes. 
Spiramycin 

U.S. Pat. No. 5,098,837 to Lilly. 

Activator Gene 

U.S. Pat. No. 5,514,544 to Lilly. 
Tylosin 

EP Pub. No. 791,655 to Lilly. 
U.S. Pat. No. 5,876,991 to Lilly. 

Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide through 
the construction of a hybrid polyketide synthase. 

Tailoring enzymes 

Merson-Davies and Cundliffe, 1994, Mol Microbiol 13: 349-355. Analysis of 
five tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae 
genome. 

As the above Table illustrates, there are a wide variety of polyketide synthase 
genes that serve as readily available sources of DNA and sequence information for use in 
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for 
constructing hybrid PKS-encoding DNA compounds are described without reference to 
the FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491 
and 5,712,146 and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and 
09/141,908, filed 28 Aug 1998, each of which is incorporated herein by reference. 

The hybrid PKS-encoding DNA compounds of the invention can be and often are 
hybrids of more than two PKS genes. Moreover, there are often two or more modules in 
the hybrid PKS in which all or part of the module is derived from a second (or third) 
PKS. Thus, as one illustrative example, the present invention provides a hybrid FK-520 
PKS that contains the naturally occurring loading module and FkbP as well as modules 
one, two, four, six, seven, eight, nine, and ten of the FK-520 PKS and further 
contains hybrid or heterologous modules three and five. Hybrid or heterologous module 
three contains an AT domain that is specific for methylmalonyl CoA and can be derived, 
for example, from the erythromycin or rapamycin PKS genes. Hybrid or heterologous 
module five contains an AT domain that is specific for malonyl CoA and can be derived, 
for example, from the picromycin or rapamycin PKS genes. 

While an important embodiment of the present invention relates to hybrid PKS 
enzymes and corresponding genes, the present invention also provides recombinant 
FK-520 PKS genes in which there is no second PKS gene sequence present but which differ 
from the FK-520 PKS gene by one or more deletions. The deletions can encompass one 
or more modules and/or can be limited to a partial deletion within one or more modules. 
When a deletion encompasses an entire module, the resulting FK-520 derivative is at 
least two carbons shorter than the compound produced by the PKS gene from which it 
was derived. When a deletion is within a module, the deletion typically encompasses a 
KR, DH, or ER domain, or both DH and ER domains, or both KR and DH domains, or 
all three KR, DH, and ER domains. 
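
The "at least two carbons shorter" statement follows from simple bookkeeping: each extension cycle adds two backbone carbons, plus any side-chain carbons carried by the extender unit. The Python sketch below illustrates that arithmetic on a made-up three-module PKS; the module list and function names are hypothetical and do not represent the FK-520 PKS.

```python
# Carbons contributed per extension cycle: two backbone carbons plus any
# side-chain carbons carried by the extender unit.
BACKBONE_CARBONS = 2
SIDE_CHAIN_CARBONS = {"malonyl": 0, "methylmalonyl": 1, "ethylmalonyl": 2}


def carbons_from_extenders(extender_units):
    """Total carbons contributed by the listed extender units."""
    return sum(BACKBONE_CARBONS + SIDE_CHAIN_CARBONS[u] for u in extender_units)


def delete_module(extender_units, index):
    """Return the extender list with one whole module removed."""
    return extender_units[:index] + extender_units[index + 1:]


# Hypothetical three-module PKS, used only to show the arithmetic.
toy_pks = ["malonyl", "methylmalonyl", "ethylmalonyl"]
full = carbons_from_extenders(toy_pks)
losses = [
    full - carbons_from_extenders(delete_module(toy_pks, i))
    for i in range(len(toy_pks))
]
print(losses)  # carbons lost when each module in turn is deleted
```

In this toy case the loss is exactly two carbons only when the deleted module uses malonyl CoA; deleting a module that uses a substituted extender removes the side-chain carbons as well.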

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one 
can employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent application 
Serial No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated herein by 
reference, in which the large PKS gene is divided into two or more, typically three, 
segments, and each segment is placed on a separate expression vector. In this manner, 
each of the segments of the gene can be altered, and various altered segments can be 
combined in a single host cell to provide a recombinant PKS gene of the invention. This 
technique makes more efficient the construction of large libraries of recombinant PKS 
genes, vectors for expressing those genes, and host cells comprising those vectors. 
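
The efficiency gain of the multi-vector technique is combinatorial: a handful of altered segments, each on its own vector, can be combined into many distinct recombinant PKS genes. The following Python sketch enumerates such combinations under assumed inputs; the segment names and variant labels are hypothetical placeholders, not constructs described in this specification.

```python
from itertools import product

# Hypothetical variant labels for each of three gene segments, each
# segment being carried on its own expression vector.
SEGMENT_VARIANTS = {
    "segment_1": ["wild-type", "altered-A"],
    "segment_2": ["wild-type", "altered-B", "altered-C"],
    "segment_3": ["wild-type", "altered-D"],
}


def enumerate_library(segment_variants):
    """List every way of choosing one variant per segment."""
    names = list(segment_variants)
    choices = (segment_variants[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*choices)]


library = enumerate_library(SEGMENT_VARIANTS)
vectors_built = sum(len(v) for v in SEGMENT_VARIANTS.values())
print(vectors_built, "vectors give", len(library), "combinations")
```

Here seven vectors yield twelve combinations; the gap widens quickly as more variants are made for each segment.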

Thus, in one important embodiment, the recombinant DNA compounds of the 
invention are expression vectors. As used herein, the term expression vector refers to any 
nucleic acid that can be introduced into a host cell or cell-free transcription and 
translation medium. An expression vector can be maintained stably or transiently in a 
cell, whether as part of the chromosomal or other DNA in the cell or in any cellular 
compartment, such as a replicating vector in the cytoplasm. An expression vector also 
comprises a gene that serves to produce RNA that is translated into a polypeptide in the 
cell or cell extract. Furthermore, expression vectors typically contain additional 
functional elements, such as resistance-conferring genes to act as selectable markers. 

The various components of an expression vector can vary widely, depending on 
the intended use of the vector. In particular, the components depend on the host cell(s) in 
which the vector will be used or is intended to function. Vector components for 
expression and maintenance of vectors in E. coli are widely known and commercially 
available, as are vector components for other commonly used organisms, such as yeast 
cells and Streptomyces cells. 

In a preferred embodiment, the expression vectors of the invention are used to 
construct recombinant Streptomyces host cells that express a recombinant PKS of the 
invention. Preferred Streptomyces host cell/vector combinations of the invention include 
S. coelicolor CH999 and S. lividans K4-114 host cells, which do not produce 
actinorhodin, and expression vectors derived from the pRM1 and pRM5 vectors, as 
described in U.S. Patent No. 5,830,750 and U.S. patent application Serial Nos. 
08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of which is 
incorporated herein by reference. 

The present invention provides a wide variety of expression vectors for use in 
Streptomyces. For replicating vectors, the origin of replication can be, for example and 
without limitation, that of a low copy number vector, such as SCP2* (see Hopwood et al., 
Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes 
Foundation, Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser 
and Melton, 1988, Gene 65: 83-91, each of which is incorporated herein by reference), 
SLP1.2 (Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and 
pSG5(ts) (Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992, 
Gene 116: 43-49, each of which is incorporated herein by reference), or that of a high copy 
number vector, such as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129: 
2703-2714; Vara et al., 1989, J. Bacteriol. 171: 5872-5881; and Servin-Gonzalez, 1993, 
Plasmid 30: 131-140, each of which is incorporated herein by reference). Generally, 
however, high copy number vectors are not preferred for expression of genes contained 
on large segments of DNA. For non-replicating and integrating vectors, it is useful to 
include at least an E. coli origin of replication, such as from pUC, p1P, p1I, and pBR. For 
phage based vectors, the phages phiC31 and KC515 can be employed (see Hopwood et 
al., supra). 

Typically, the expression vector will comprise one or more marker genes by 
which host cells containing the vector can be identified and/or selected. Useful antibiotic 
resistance conferring genes for use in Streptomyces host cells include the ermE (confers 
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance 
to thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4 
(confers resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and 
neomycin), hyg (confers resistance to hygromycin), and vph (confers resistance to 
viomycin) resistance conferring genes. 

The recombinant PKS gene on the vector will be under the control of a promoter, 
typically with an attendant ribosome binding site sequence. The present invention 
provides the endogenous promoters of the FK-520 PKS and related biosynthetic genes in 
recombinant form, and these promoters are preferred for use in the native hosts and in 
heterologous hosts in which the promoters function. A preferred promoter of the 
invention is the fkbO gene promoter, comprised in a sequence of about 270 bp between 
the start of the open reading frames of the fkbO and fkbB genes. The fkbO promoter is 
believed to be bi-directional in that it promotes transcription of the genes fkbO, fkbP, and 
fkbA in one direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the 
present invention provides a recombinant expression vector comprising the promoter of 
the fkbO gene of an FK-520 producing organism positioned to transcribe a gene other 
than fkbO. In a preferred embodiment the transcribed gene is an FK-520 PKS gene. In 
another preferred embodiment, the transcribed gene is a gene that encodes a protein 
comprised in a hybrid PKS. 

Heterologous promoters can also be employed and are preferred for use in host 
cells in which the endogenous FK-520 PKS gene promoters do not function or function 
poorly. A preferred heterologous promoter is the actI promoter and its attendant activator 
gene actII-ORF4, which is provided in the pRM1 and pRM5 expression vectors, supra. 
This promoter is activated in the stationary phase of growth when secondary metabolites 
are normally synthesized. Other useful Streptomyces promoters include without limitation 
those from the ermE gene and the melC1 gene, which act constitutively, and the tipA 
gene and the merA gene, which can be induced at any growth stage. In addition, the T7 
RNA polymerase system has been transferred to Streptomyces and can be employed in 
the vectors and host cells of the invention. In this system, the coding sequence for the T7 
RNA polymerase is inserted into a neutral site of the chromosome or in a vector under the 
control of the inducible merA promoter, and the gene of interest is placed under the 
control of the T7 promoter. As noted above, one or more activator genes can also be 
employed to enhance the activity of a promoter. Activator genes in addition to the 
actII-ORF4 gene discussed above include dnrI, redD, and ptpA genes (see U.S. patent 
application Serial No. 09/181,833, supra) to activate promoters under their control. 

In addition to providing recombinant DNA compounds that encode the FK-520 
PKS, the present invention also provides DNA compounds that encode the enzymes that 
synthesize the ethylmalonyl CoA and 2-hydroxymalonyl CoA utilized in the synthesis of 
FK-520. Thus, the present invention also provides recombinant host cells that express the 
genes required for the biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA. 
Figures 3 and 4 show the location of these genes on the cosmids of the invention and the 
biosynthetic pathway that produces ethylmalonyl CoA. 

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are 
sufficient to confer this ability on Streptomyces host cells. For conversion of 
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the 
complete coding sequence for fkbH is provided on the cosmids of the invention, the 
sequence for this gene provided herein may be missing a T residue, based on a 
comparison made with a similar gene cloned from the ansamitocin gene cluster by Dr. H. 
Floss. Where the sequence herein shows one T, there may be two, resulting in an 
extension of the fkbH reading frame to encode the amino acid sequence: 

MTIVKCLVWDLDNTLWRGTVLEDDEVVLTDEIREVITTLDDRGILQAVASKNDH 
DLAWERLERLGVAEYFVLARIGWGPKSQSVREIATELNFAPTTIAFIDDQPAERA 
EVAFHLPEVRCYPAEQAATLLSLPEFSPPVSTVDSRRRRLMYQAGFARDQAREA 
YSGPDEDFLRSLDLSMTIAPAGEEELSRVEELTLRTSQMNATGVHYSDADLRALL 
TDPAHEVLVVTMGDRFGPHGAVGIILLEKKPSTWHLKLLATSCRVVSFGAGATIL 
NWLTDQGARAGAHLVADFRRTDRNRMMEIAYRFAGFADSDCPCVSEVAGASA 
AGVERLHLEPSARPAPTTLTLTAADIAPVTVSAAG. 
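
The effect of a single missing T on a reading frame can be illustrated with a toy example. The Python sketch below uses a short invented DNA string, not the fkbH sequence, together with the standard genetic code; it shows how reading one T where there should be two can introduce a premature stop codon, and how restoring the second T extends the open reading frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna):
    """Translate from position 0 until the first stop codon or end of sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented sequence in which a single T is read where there may be two.
one_t = "ATGGCCTAGCAGGTCACTAA"
# Same sequence with the second T restored after position 7.
two_t = one_t[:7] + "T" + one_t[7:]

print(translate_orf(one_t))
print(translate_orf(two_t))
```

With one T the toy frame stops after two residues; with two T's the same DNA encodes six residues before reaching a stop codon.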

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase, 
which can be supplied by the host cell but can also be supplied by recombinant 
expression of the fkbS gene of the present invention. To increase yield of ethylmalonyl 
CoA, one can also express the fkbE and fkbU genes. While such production can 
be achieved using only the recombinant genes above, one can also achieve such 
production by placing into the recombinant host cell a large segment of the DNA 
provided by the cosmids of the invention. Thus, for 2-hydroxymalonyl and 
2-methoxymalonyl CoA biosynthesis, one can simply provide the cells with the segment of 
DNA located on the left side of the FK-520 PKS genes shown in Figure 1. For 
ethylmalonyl CoA biosynthesis, one can simply provide the cells with the segment of 
DNA located on the right side of the FK-520 PKS genes shown in Figure 1 or, 
alternatively, both the right and left segments of DNA. 

The recombinant DNA expression vectors that encode these genes can be used to 
construct recombinant host cells that can make these important polyketide building 
blocks from cells that otherwise are unable to produce them. For example, Streptomyces 
coelicolor and Streptomyces lividans do not synthesize ethylmalonyl CoA or 
2-hydroxymalonyl CoA. The invention provides methods and vectors for constructing 
recombinant Streptomyces coelicolor and Streptomyces lividans that are able to 
synthesize either or both ethylmalonyl CoA and 2-hydroxymalonyl CoA. These host cells 
are thus able to make polyketides that require these substrates and that cannot otherwise 
be made in such cells. 

In a preferred embodiment, the present invention provides recombinant 
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed 
with a recombinant vector of the invention that codes for the expression of the 
ethylmalonyl CoA biosynthetic genes. The resulting host cells produce ethylmalonyl 
CoA and so are preferred host cells for the production of polyketides produced by PKS 
enzymes that comprise one or more AT domains specific for ethylmalonyl CoA. 
Illustrative PKS enzymes of this type include the FK-520 PKS and a recombinant PKS in 
which one or more AT domains is specific for ethylmalonyl CoA. 

In a related embodiment, the present invention provides Streptomyces host cells in 
which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have 
been deleted by homologous recombination or rendered inactive by mutation. For 
example, deletion or inactivation of the fkbG gene can prevent formation of the methoxyl 
groups at C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell, 
FK-506), leading to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520 (or, 
in the corresponding FK-506 producing cell, 13,15-didesmethoxy-13,15-dihydroxy-FK-506). 
If the fkbG gene product acts on 2-hydroxymalonyl and the resulting 
2-methoxymalonyl substrate is required for incorporation by the PKS, the AT domains of 
modules 7 and 8 may instead bind malonyl CoA and methylmalonyl CoA. Such incorporation 
results in the production of a mixture of polyketides in which the methoxy groups at C-13 
and C-15 of FK-520 (or FK-506) are replaced by either hydrogen or methyl. 
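
The size of the mixture described above can be enumerated directly. In the Python sketch below, which is an illustration only, each of the two positions independently receives hydrogen (from malonyl CoA) or methyl (from methylmalonyl CoA); the assumption that the two positions vary independently, and the names used in the code, are introduced here and are not stated in this specification.

```python
from itertools import product

# Positions that carry methoxy groups in FK-520 (or FK-506).
POSITIONS = ("C-13", "C-15")

# Substituent left at a position by each alternative extender unit.
SUBSTITUENT_BY_EXTENDER = {
    "malonyl CoA": "H",
    "methylmalonyl CoA": "methyl",
}


def possible_products():
    """Enumerate substituent patterns, assuming the positions vary independently."""
    substituents = list(SUBSTITUENT_BY_EXTENDER.values())
    return [
        dict(zip(POSITIONS, combo))
        for combo in product(substituents, repeat=len(POSITIONS))
    ]


mixture = possible_products()
for pattern in mixture:
    print(pattern)
```

Under that assumption the mixture contains up to four compounds, ranging from the 13,15-didesmethoxy compound (hydrogen at both positions) to the compound bearing methyl at both positions.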

This possibility of non-specific binding is suggested by the construction of a hybrid 
PKS of the invention in which the AT domain of module 8 of the FK-520 PKS replaced 
the AT domain of module 6 of DEBS. The resulting PKS produced, in Streptomyces 
lividans, 6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of module 8 of 
the FK-520 PKS could bind malonyl CoA and methylmalonyl CoA substrates. Thus, one 
could possibly also prepare the 13,15-didesmethoxy-FK-520 and corresponding FK-506 
compounds of the invention by deleting or otherwise inactivating one or more or all of 
the genes required for 2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI, fkbJ, and 
fkbK genes. In any event, the deletion or inactivation of one or more biosynthetic genes 
required for ethylmalonyl and/or 2-hydroxymalonyl production prevents the formation of 
polyketides requiring ethylmalonyl and/or 2-hydroxymalonyl for biosynthesis, and the 
resulting host cells are thus preferred for production of polyketides that do not require the 
same. 

The host cells of the invention can be grown and fermented under conditions
known in the art for other purposes to produce the compounds of the invention. See, e.g.,
U.S. Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference,
for suitable fermentation processes. The compounds of the invention can be isolated from
the fermentation broths of these cultured cells and purified by standard procedures.
Preferred compounds of the invention include the following compounds:
13-desmethoxy-FK-506; 13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506;
13,15-didesmethoxy-FK-520; 13-desmethoxy-18-hydroxy-FK-506;
13-desmethoxy-18-hydroxy-FK-520; 13,15-didesmethoxy-18-hydroxy-FK-506; and
13,15-didesmethoxy-18-hydroxy-FK-520.
These compounds can be further modified as described for tacrolimus and FK-520 in
U.S. Patent Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323; 4,980,466; and 4,920,218,
incorporated herein by reference.

Other compounds of the invention are shown in Figure 8, Parts A and B. In Figure
8, Part A, illustrative C-32-substituted compounds of the invention are shown in two
columns under the heading R. The substituted compounds are preferred for topical
administration and are applied to the dermis for treatment of conditions such as psoriasis.
In Figure 8, Part B, illustrative reaction schemes for making the compounds shown in
Figure 8, Part A, are provided. In the upper scheme in Figure 8, Part B, the C-32
substitution is a tetrazole moiety, illustrative of the groups shown in the left column
under R in Figure 8, Part A. In the lower scheme in Figure 8, Part B, the C-32
substitution is a disubstituted amino group, where R3 and R4 can be any group similar to
the illustrative groups shown attached to the amine in the right column under R in Figure
8, Part A. While Figure 8 shows the C-32-substituted compounds in which the
C-15-methoxy is present, the invention includes these C-32-substituted compounds in which
C-15 is ethyl, methyl, or hydrogen. Also, while C-21 is shown as substituted with ethyl or
allyl, the compounds of the invention include the C-32-substituted compounds in which
C-21 is substituted with hydrogen or methyl.

To make these C-32-substituted compounds, Figure 8, Part B, provides illustrative
reaction schemes. Thus, a selective reaction of the starting compound (see Figure 8, Part
B, for an illustrative starting compound) with trifluoromethanesulfonic anhydride in the
presence of a base yields the C-32 O-triflate derivative, as shown in the upper scheme of
Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or triazole derivatives
provides the C-32 tetrazole or triazole derivative. As shown in the lower scheme of
Figure 8, Part B, reacting the starting compound with p-nitrophenylchloroformate yields
the corresponding carbonate, which, upon displacement with an amino compound,
provides the corresponding carbamate derivative.

The compounds can be readily formulated to provide the pharmaceutical
compositions of the invention. The pharmaceutical compositions of the invention can be
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or
liquid form. This preparation contains one or more of the compounds of the invention as
an active ingredient in admixture with an organic or inorganic carrier or excipient
suitable for external, enteral, or parenteral application. The active ingredient may be
compounded, for example, with the usual non-toxic, pharmaceutically acceptable carriers
for tablets, pellets, capsules, suppositories, solutions, emulsions, suspensions, and any
other form suitable for use. Suitable formulation processes and compositions for the
compounds of the present invention are described with respect to tacrolimus in U.S.
Patent Nos. 5,939,427; 5,922,729; 5,385,907; 5,338,684; and 5,260,301, incorporated
herein by reference. Many of the compounds of the invention contain one or more chiral
centers, and all of the stereoisomers are included within the scope of the invention, as
pure compounds as well as mixtures of stereoisomers. Thus the compounds of the
invention may be supplied as a mixture of stereoisomers in any proportion.

The carriers which can be used include water, glucose, lactose, gum acacia,
gelatin, mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal
silica, potato starch, urea, and other carriers suitable for use in manufacturing
preparations, in solid, semi-solid, or liquified form. In addition, auxiliary stabilizing,
thickening, and coloring agents and perfumes may be used. For example, the compounds
of the invention may be utilized with hydroxypropyl methylcellulose essentially as
described in U.S. Patent No. 4,916,138, incorporated herein by reference, or with a
surfactant essentially as described in EPO patent publication No. 428,169, incorporated
herein by reference.

Oral dosage forms may be prepared essentially as described by Hondo et al.,
1987, Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by
reference. Dosage forms for external application may be prepared essentially as described
in EPO patent publication No. 423,714, incorporated herein by reference. The active
compound is included in the pharmaceutical composition in an amount sufficient to
produce the desired effect upon the disease process or condition.

For the treatment of conditions and diseases relating to immunosuppression or
neuronal damage, a compound of the invention may be administered orally, topically,
parenterally, by inhalation spray, or rectally in dosage unit formulations containing
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The
term parenteral, as used herein, includes subcutaneous injections, and intravenous,
intramuscular, and intrasternal injection or infusion techniques.

Dosage levels of the compounds of the present invention are of the order from
about 0.01 mg to about 50 mg per kilogram of body weight per day, preferably from
about 0.1 mg to about 10 mg per kilogram of body weight per day. The dosage levels are
useful in the treatment of the above-indicated conditions (from about 0.7 mg to about 3.5
g per patient per day, assuming a 70 kg patient). In addition, the compounds of the
present invention may be administered on an intermittent basis, i.e., at semi-weekly,
weekly, semi-monthly, or monthly intervals.
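
As a worked illustration of the arithmetic behind these ranges, the following minimal sketch converts a per-kilogram dosage level into a per-patient daily amount. The function name `daily_dose_mg` and the 70 kg default are illustrative assumptions and are not part of the disclosure; the sketch only multiplies the stated mg/kg level by body weight.

```python
def daily_dose_mg(dose_mg_per_kg: float, body_weight_kg: float = 70.0) -> float:
    """Return the total daily dose in mg for a per-kilogram dosage level.

    Hypothetical helper for illustration: dose (mg/kg/day) x weight (kg).
    """
    if dose_mg_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and body weight positive")
    return dose_mg_per_kg * body_weight_kg


if __name__ == "__main__":
    # Dosage levels quoted in the text, in mg per kg of body weight per day.
    levels = [
        ("broad range, low", 0.01),
        ("broad range, high", 50.0),
        ("preferred range, low", 0.1),
        ("preferred range, high", 10.0),
    ]
    for label, level in levels:
        print(f"{label}: {level} mg/kg/day -> {daily_dose_mg(level):g} mg/day")
```

For a 70 kg patient, 0.01 mg/kg corresponds to about 0.7 mg per day and 50 mg/kg to 3500 mg (3.5 g) per day; the preferred range of 0.1 to 10 mg/kg corresponds to about 7 mg to 700 mg per day.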

The amount of active ingredient that may be combined with the carrier materials
to produce a single dosage form will vary depending upon the host treated and the
particular mode of administration. For example, a formulation intended for oral
administration to humans may contain from 0.5 mg to 5 g of active agent compounded
with an appropriate and convenient amount of carrier material, which may vary from
about 5 percent to about 95 percent of the total composition. Dosage unit forms will
generally contain from about 0.5 mg to about 500 mg of active ingredient. For external
administration, the compounds of the invention can be formulated within the range of, for
example, 0.00001% to 60% by weight, preferably from 0.001% to 10% by weight, and
most preferably from about 0.005% to 0.8% by weight. The compounds and
compositions of the invention are useful in treating disease conditions using doses and
administration schedules as described for tacrolimus in U.S. Patent Nos. 5,542,436;
5,365,948; 5,348,966; and 5,196,437, incorporated herein by reference. The compounds
of the invention can be used as single therapeutic agents or in combination with other
therapeutic agents. Drugs that can be usefully combined with compounds of the invention
include one or more immunosuppressant agents such as rapamycin, cyclosporin A,
FK-506, or one or more neurotrophic agents.

It will be understood, however, that the specific dosage level for any particular 
patient will depend on a variety of factors. These factors include the activity of the 

specific compound employed; the age, body weight, general health, sex, and diet of the
subject; the time and route of administration and the rate of excretion of the drug; 
whether a drug combination is employed in the treatment; and the severity of the 
particular disease or condition for which therapy is sought. 
A detailed description of the invention having been provided above, the following
examples are given for the purpose of illustrating the present invention and shall not be 
construed as being a limitation on the scope of the invention or claims. 



Example 1

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520 
The C-13 methoxyl group is introduced into FK-520 via an AT domain in
extender module 8 of the PKS that is specific for hydroxymalonyl and by methylation of
the hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase.
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position into
an inactive derivative that is further degraded by host P450 and other enzymes. The
present invention provides compounds related in structure to FK-506 and FK-520 that do
not contain the C-13 methoxy group and exhibit greater stability and a longer half-life in
vivo. These compounds are useful medicaments due to their immunosuppressive and
neurotrophic activities, and the invention provides the compounds in purified form and as
pharmaceutical compositions.

The present invention also provides the novel PKS enzymes that produce these 
novel compounds as well as the expression vectors and host cells that produce the novel 
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT 

domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the
FK-506 and FK-520 PKS. This example describes the construction of recombinant DNA
compounds that encode the novel FK-520 PKS enzymes and the transformation of host 
cells with those recombinant DNA compounds to produce the novel PKS enzymes and 
the polyketides produced thereby. 

To construct an expression cassette for performing module 8 AT domain
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster
was cloned into plasmid pLitmus 38 (a cloning vector available from New England
Biolabs). The 4.6 kb SphI fragment, which encodes the ACP domain of module 7
followed by module 8 through the KR domain, was isolated from an agarose gel after
digesting the cosmid pKOS65-C31 with SphI. The clone having the insert oriented so
the single SacI site was nearest to the SpeI end of the polylinker was identified and
designated as plasmid pKOS60-21-67. To generate appropriate cloning sites, two linkers
were ligated sequentially as follows. First, a linker was ligated between the SpeI and
SacI sites to introduce a BglII site at the 5' end of the cassette, to eliminate interfering
polylinker sites, and to reduce the total insert size to 4.5 kb (the limit of the phage
KC515). The ligation reactions contained 5 picomolar unphosphorylated linker DNA and
0.1 picomolar vector DNA, i.e., a 50-fold molar excess of linker to vector. The linker had
the following sequence:

5'-CTAGTGGGCAGATCTGGCAGCT-3'
    3'-ACCCGTCTAGACCG-5'

The resulting plasmid was designated pKOS60-27-1.
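
The design of the first linker can be checked with a short script. The sketch below is illustrative only: the helper names `reverse_complement` and `duplex_region` are assumptions introduced here, and the strand sequences are those given in the text with the OCR spacing removed. It locates where the shorter strand anneals on the longer strand and reports the single-stranded ends that remain, which should match the CTAG overhang left by SpeI and the AGCT overhang left by SacI.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Linker strands as given in the text (spacing from the scan removed).
TOP = "CTAGTGGGCAGATCTGGCAGCT"    # written 5'->3'
BOTTOM_3_TO_5 = "ACCCGTCTAGACCG"  # written 3'->5' in the text
BOTTOM = BOTTOM_3_TO_5[::-1]      # same strand rewritten 5'->3'


def duplex_region(top: str, bottom: str):
    """Return (start, end) on `top` where `bottom` anneals, or None."""
    target = reverse_complement(bottom)
    start = top.find(target)
    return None if start < 0 else (start, start + len(target))


if __name__ == "__main__":
    start, end = duplex_region(TOP, BOTTOM)
    print("5' single-stranded end:", TOP[:start])   # SpeI-compatible
    print("duplex region:         ", TOP[start:end])
    print("3' single-stranded end:", TOP[end:])     # SacI-compatible
    print("BglII site in duplex:  ", "AGATCT" in TOP[start:end])
```

The same check can be applied to any annealed linker pair by substituting the two strand sequences.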

Next, a linker of the following sequence was ligated between the unique SphI and
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module 8
cassette. The linker employed was:

    5'-GGGATGCATGGC-3'
3'-GTACCCCTACGTACCGAATT-5'

The resulting plasmid was designated pKOS60-29-55.

To allow in-frame insertions of alternative AT domains, sites were engineered at
the 5' end (AvrII or NheI) and 3' end (XhoI) of the AT domain using the polymerase
chain reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the
PCR and sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and
either Avr-rev or Nhe-rev:

SpeBgl-fwd  5'-CGACTCACTAGTGGGCAGATCTGG-3'
Avr-rev     5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3'
Nhe-rev     5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3'
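
A quick way to confirm that each primer carries the intended cloning site is to search it for the enzyme recognition sequences. The sketch below is a minimal illustration; the dictionary names and the `sites_in` helper are assumptions introduced here. Because all four recognition sequences are palindromic, searching one strand is sufficient.

```python
# Recognition sequences (5'->3') of the enzymes named in the text.
SITES = {
    "SpeI": "ACTAGT",
    "BglII": "AGATCT",
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
}

# Primer sequences from the text, with the spacing from the scan removed.
PRIMERS = {
    "SpeBgl-fwd": "CGACTCACTAGTGGGCAGATCTGG",
    "Avr-rev": "CACGCCTAGGCCGGTCGGTCTCGGGCCAC",
    "Nhe-rev": "GCGGCTAGCTGCTCGCCCATCGCGGGATGC",
}


def sites_in(primer: str) -> list:
    """Return the sorted names of the recognition sites present in a primer."""
    return sorted(name for name, site in SITES.items() if site in primer)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {', '.join(sites_in(seq)) or 'none'}")
```

The forward primer carries both the SpeI and BglII sites, while each reverse primer carries only the site that defines the 5' boundary of the AT domain.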

The PCR included, in a 50 µl reaction, 5 µl of 10x Pfu polymerase buffer
(Stratagene), 5 µl of 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM
dGTP, 1 mM 7-deaza-GTP), 5 µl DMSO, 2 µl of each primer (10 µM), 1 µl of template
DNA (0.1 µg/µl), and 1 µl of cloned Pfu polymerase (Stratagene). The PCR conditions
were 95°C for 2 min., 25 cycles at 95°C for 30 sec., 60°C for 30 sec., and 72°C for 4
min., followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and
the Litmus vectors were cut with the appropriate restriction enzymes (BglII and AvrII or
SpeI and NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs),
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2,
respectively.
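
The reaction composition above implies the final concentrations and the make-up volume, which can be worked out as follows. This is a sketch under stated assumptions: the text does not name the diluent, so the remainder of the 50 µl is assumed here to be water, and the function names are hypothetical. The cycling time counts programmed hold times only and ignores ramping.

```python
# Volumes in microlitres, taken from the reaction described in the text.
TOTAL_UL = 50
COMPONENTS_UL = {
    "10x Pfu buffer": 5,
    "10x z-dNTP mixture": 5,
    "DMSO": 5,
    "forward primer (10 uM)": 2,
    "reverse primer (10 uM)": 2,
    "template DNA (0.1 ug/ul)": 1,
    "Pfu polymerase": 1,
}


def water_volume_ul(components=COMPONENTS_UL, total=TOTAL_UL):
    """Volume left to make up to the total (assumed to be water)."""
    return total - sum(components.values())


def final_concentration(stock, volume_ul, total=TOTAL_UL):
    """Simple dilution: stock concentration x volume added / final volume."""
    return stock * volume_ul / total


def total_hold_minutes(cycles=25):
    """Sum of programmed hold times in minutes; ramp times are ignored."""
    initial, denature, anneal, extend, final = 2.0, 0.5, 0.5, 4.0, 4.0
    return initial + cycles * (denature + anneal + extend) + final


if __name__ == "__main__":
    print("make-up volume (ul):   ", water_volume_ul())
    print("each primer (uM):      ", final_concentration(10, 2))
    print("dATP/dCTP/dTTP (mM):   ", final_concentration(2, 5))
    print("dGTP, 7-deaza-GTP (mM):", final_concentration(1, 5))
    print("DMSO (% v/v):          ", final_concentration(100, 5))
    print("hold time (min):       ", total_hold_minutes())
```

Under these assumptions the listed components total 21 µl, leaving 29 µl of make-up volume, with each primer at 0.4 µM and DMSO at 10% v/v in the final reaction.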

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify
sequence 3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev:

BsrXho-fwd  5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC-3'
NsiAfl-rev  5'-CGACTCACTTAAGCCATGCATCC-3'

PCR conditions were as described above. The PCR fragment was cut with BsrGI
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and
inserted into pKOS60-37-2 cut with BsrGI and AflII, to give the plasmids pKOS60-39-1
and pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and
XhoI or NheI and XhoI, respectively, to insert heterologous AT domains specific for
malonyl, methylmalonyl, ethylmalonyl, or other extender units.

Malonyl and methylmalonyl-specific AT domains were cloned from the
rapamycin cluster using PCR amplification with a pair of primers that introduce an AvrII
or NheI site at the 5' end and an XhoI site at the 3' end. The PCR conditions were as
given above and the primer sequences were as follows:

RATN1     5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3'
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA),
RATMN2    5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3'
(Rap AT shorter version 5'- sequence and specific for malonyl CoA),
RATMMN2   5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3'
(Rap AT shorter version 5'- sequence and specific for methylmalonyl CoA), and
RATC      5'-ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG-3'
(Rap DH 5'- sequence and universal for malonyl and methylmalonyl CoA).
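
The "universal" primers RATN1 and RATC contain IUPAC degenerate bases (R = A or G, Y = C or T, S = C or G), so each is a small mixture of sequences. The following minimal sketch enumerates the individual sequences in each mixture; the `expand` helper and the `IUPAC` table (limited to the codes used here) are illustrative assumptions.

```python
from itertools import product

# IUPAC nucleotide codes needed for these primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "S": "CG",  # strong (G or C)
}

# Degenerate primers from the text, with the spacing from the scan removed.
RATN1 = "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG"
RATC = "ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG"


def expand(primer: str) -> list:
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


if __name__ == "__main__":
    for name, primer in (("RATN1", RATN1), ("RATC", RATC)):
        variants = expand(primer)
        print(f"{name}: {len(variants)} variants")
        for v in variants:
            print("  ", v)
```

Each primer has two two-fold degenerate positions and therefore expands to four sequences, all of which retain the engineered site (AvrII, CCTAGG, in RATN1; XhoI, CTCGAG, in RATC).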

[Diagram: primer positions on any Rap module (KS and AT domains shown). N1 - AvrII; MN2 - NheI; MMN2 - NheI; XhoI - C.]

Because of the high sequence similarity in each module of the rapamycin cluster,
each primer was expected to prime any of the AT domains. PCR products representing
ATs specific for malonyl or methylmalonyl extenders were identified by sequencing
individual cloned PCR products. Sequencing also confirmed that the chosen clones
contained no cloning artifacts. Examples of hybrid modules with the rapamycin AT12
and AT13 domains are shown in a separate figure.

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 of the
rapamycin PKS has the DNA sequence and encodes the amino acid sequence shown
below. The AT of rap module 12 is specific for incorporation of malonyl units.

20 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
IWQLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGED I PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
25 FKDLGI DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
30 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 
35 ASPEELWHLVASGT DAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
40 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGIS PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 7 00 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
5 TDGFGATGSQTSVLSG 

GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
10 CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
15 GRAKAFGAGADGTS FAE 

GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 110 0 
GHTVLAVVRGSAVNQDG 
20 GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLT PADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
25 VEAHGTGTRLGDPIEAQ 

GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA 
30 GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GIIKMVQALRHGELPPT
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADEPS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
35 ELLTSARPWPETDRPR 

GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 
LESAPPTQPADNAVIER 
40 GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 
APEWVPLVISARTQSA 
TGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 1700 
LTEHEGRLRAYLAASPG 
GTGGATATGCGGGCTGTGGCATCGACGCTGGCGATGACACGGTCGGTGTT 1750 
45 VDMRAVASTLAMTRSVF 

CGAGCACCGTGCCGTGCTGCTGGGAGATGACACCGTCACCGGCACCGCTG 1800 

EHRAVLLGDDTVTGTA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGACAGGGGTCGCAGCGT 1850 
VSDPRAVFVFPGQGSQR 
50 GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1900 
AGMGEELAAAFPVFARI 
CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG 1950 

HQQVWDLLDVPDLEVN 
AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000 
ETGYAQPALFAMQVALF
GGGCTGCTGGAATCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTC 2050 

GLLESWGVRPDAVI GHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 
5 VGELAAAYVS GVWS LE 

ATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
DACTLVSARARLMQALP 
GCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGC 2200 
AGGVMVAVPVS EDEARA 
10 CGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 2250 
VLGEGVEIAAVNGPSS 
TGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTG 2300 
VVLSGDEAAVLQAAEGL 
GGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTAT 2350 
15 GKWTRLATS HAFH SARM 

GGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACC 2400 

EPMLEEFRAVAEGLTY 
GGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 2450 
RTPQVSMAVGDQVTTAE 
20 TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2500 
YWVRQVRDTVRFGEQVA 
CTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550 

SYEDAVFVELGADRSL 
CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2 600 
25 ARLVDGVAMLHGDHEIQ 

GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2650 

AAIGALAHLYVNGVTVD 
CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 27 00 
WPALLGDAPATRVLDL 
30 CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 27 50 
PTYAFQHQRYWLESARP 
GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 2800 

AASDAGHPVLGSGIALA 
CGGGTCGCCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACC 2850 
35 GSPGRVFTGSVPTGAD 

GCGCGGTGTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGAC 2 900 
RAVFVAELALAAADAVD 
TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2950 
CATVERLDIASVPGRPG 
40 CCATGGCCGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACG 3000 
HGRTTVQTWVDEPADD 
GCCGGCGCCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACG 3050 
GRRRFTVHTRTGDAPWT 
CTGCACGCCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGC 3100 
45 LHAEGVLRPHGTAL PDA 

GGCCGACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 

ADAEWPPPGAVPADGL 
CGGGTGTGTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGAC 3200 
PGVWRRGDQVFAEAEVD 
50 GGACCGGACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTC 3250 
GPDGFVVHPDLLDAVFS 
CGCGGTCGGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGG 3300 

AVGDGSRQPAGWRDLT 
TGCACGCGTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACC 3350 
VHASDATVLRACLTRRT
GACGGAGCCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACT 3400 

DGAMGFAAFDGAGLPVL 
CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 3450 
5 TAEAVTLREVAS PSGS 

AGGAGTCGGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCG 3500 
EES DGLHRLEWLAVAEA 
GTCTACGACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCA 3550 
VYDGDLPEGHVLITAAH 
1 0 CCCCGACGACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCC 3600 
PDDPEDI PTRAHTRAT 
GCGTCCTGACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTC 3650 
RVLTALQHHLTTTDHTL 
ATCGTCCACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCAC 3700 
15 IVHTTTDPAGATVTGLT 

CCGCACCGCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCG 3750 

RTAQNEHPHRIRLIET 
ACCACCCCCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCAC 3800 
DHPHTPLPLAQLATLDH 
20 CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 3850 
PHLRLTHHTLHHPHLTP 
CCTCCACACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACG 3900 

LHTTTPPTTTPLNPEH 
CCATCATCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGC 3950 
25 AI I ITGGSGTLAGILAR 

CACCTGAACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4000 

HLNHPHTYLLSRTPPPD 
CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4 050 
ATPGTHLPCDVGDPHQ 
30 TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100 
LATTLTHIPQPLTAIFH 
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150 

TAATLDDGILHALTPDR 
CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4200 
35 LTTVLHPKANAAWHLH 

ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4250 
HLTQNQPLTHFVLYSSA 
GCCGCCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4300 
AAVLGS PGQGNYAAANA 
40 CTTCCTCGACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4350 
FLDALATHRHTLGQPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4 400 
TSIAWGMWHTTSTLTGQ 
CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 4 450 
45 LDDADRDRIRRGGFLPI 
CACGGACGACGAGGGCATGGGGATGCAT 
T D D E G 
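
The reading frame of a listing such as the one above can be checked by translating the nucleotide sequence and comparing the result with the printed amino acid line. The sketch below is illustrative only: it assumes the standard genetic code, the `translate` helper is hypothetical, and the offset of 2 was chosen because the printed translation of the first 50 nucleotides begins at the third base.

```python
from itertools import product

# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate complete codons of `dna` starting at `offset` (0-based)."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# First 50 nucleotides of the listing, as printed in the text.
FIRST_50 = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"

if __name__ == "__main__":
    print(translate(FIRST_50, offset=2))
```

Translating from the third base reproduces the first printed amino acid line, IWQLAEALLTLVREST, which confirms that the protein line is read in the frame that begins at the second base of the BglII site.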

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 
QLAEALLTLVREST 
5 GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
AAVLGHVGGED I PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGIDSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 2 00 
10 ALTEATGVRLNATAVFD 

TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPTPHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
15 ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

ASPEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
20 T EFPTDRGWDVDAI YD 

CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGI S PRE 
25 GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 7 00 
30 TGVFVGAFSYGYGTGAD 

CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 

TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
35 GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
40 S PGGFVEFSRQRGLAPD 

GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTS FAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLS DAERN 
45 GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
50 RQALANAGLTPADVDA 

TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSNIGHAQAASGVA 
5 GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GIIKMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADEPS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
10 ELLTSARPWPETDRPR 

GGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACCAACGCCCACGTCATC 1550 
RAGVSS FGVSGTNAHVI 
CTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGAGGCGCAGCCTGTTGA 1600 
LESAPPAQPAEEAQPVE 
15 GACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGGTGATATCGGCCAAGA 1650 
TPVVASDVLPLVISAK 
CCCAGCCCGCCCTGACCGAACACGAAGACCGGCTGCGCGCCTACCTGGCG 17 00 
TQPALTEHEDRLRAYLA 
GCGTCGCCCGGGGCGGATATACGGGCTGTGGCATCGACGCTGGCGGTGAC 17 50 
20 AS PGADI RAVASTLAVT 

ACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTGGAGATGACACCGTCA 1800 

RSVFEHRAVLLGDDTV 
CCGGCACCGCGGTGACCGACCCCAGGATCGTGTTTGTCTTTCCCGGGCAG 1850 
TGTAVTDPRIVFVFPGQ 
25 GGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCGCGATTCGTCGGTGGT 1900 
GWQWLGMGSALRDSSVV 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 

FAERMAECAAALREFV 
ACTGGGATCTGTTCACGGTTCTGGATGATCCGGCGGTGGTGGACCGGGTT 2000 
30 DWDLFTVLDDPAVVDRV 

GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGTTTCCCTGGCCGCGGT 2050 

DVVQPASWAMMVSLAAV 
GTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGATCGGCCATTCGCAGG 2100 
WQAAGVRPDAVIGHSQ 
35 GTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTGTCACTACGCGATGCC 2150 
GEIAAACVAGAVSLRDA 
GCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGCCCGGGGCCTGGCGGG 2200 

ARIVTLRSQAIARGLAG 
CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGAGCTGG 2250 
40 RGAMASVALPAQDVEL 

TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2300 
VDGAWIAAHNGPASTVI 
GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2350 
AGT PEAVDHVLTAHEAQ 
45 AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 24 00 
GVRVRRITVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 24 50 
HVELIRDELLDITSDSS 
TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 
50 SQTPLVPWLSTVDGTWV 

CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550 

DSPLDGEYWYRNLREP 
TCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCCCAGGGCGACACCGTG 2600 
VGFHPAVSQLQAQGDTV 
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2650

FVEVSASPVLLQAMDDD 
TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCACCCGGA 27 00 
VVTVATLRRDDGDATR 
5 TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 2750 
MLTALAQAYVHGVTVDW 
CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800 

PAILGTTTTRVLDLPTY 
CGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850 
10 AFQHQRYWLESARPAA 

CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2900 
SDAGHPVLGSGIALAGS 
CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2950 
PGRVFTGSVPTGADRAV 
15 GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 
FVAELALAAADAVDCA 
CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
CGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCG 3100
20 RTTVQTWVDE PADDGRR 

CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACG 3150 

RFTVHTRTGDAPWTLH 
CCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200 
AEGVLRPHGTALPDAAD 
25 GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGT 3250 
AEWPPPGAVPADGLPGV 
GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3300 

WRRGDQVFAEAEVDGP 
ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3350 
30 DGFVVHPDLLDAVFSAV 

GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 3400 

GDGSRQPAGWRDLTVHA 
GTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAG 3450 
SDATVLRACLTRRTDG 
35 CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500 
AMGFAAFDGAGLPVLTA 
GAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3550 

EAVTLREVAS PSGS EES 
GGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACG 3600 
40 DGLHRLEWLAVAEAVY 

ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
DGDLPEGHVLITAAHPD 
GACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCT 3700 
DPEDIPTRAHTRATRVL 
45 GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 3750 
TALQHHLTTTDHTLIV 
ACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACC 3800 
HTTTDPAGATVTGLTRT 
GCCCAGAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCC 3850 
50 AQNEHPHRIRLIETDHP 

CCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACC 3900 

HTPLPLAQLATLDHPH 
TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3950 
LRLTHHTLHHPHLTPLH 
ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCAT 4000

TTTPPTTTPLNPEHAII 
CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4050 
ITGGSGTLAGILARHL 
5 ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4100 
NHPHTYLLSRTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150 

PGTHLPCDVGDPHQLAT 
CACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4200 
10 TLTHI PQPLTAIFHTA 

CCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACC 4250 
ATLDDGILHALTPDRLT 
ACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4300 
TVLH PKANAAWHLHHLT 
1 5 CCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCG 4350 
QNQPLTHFVLYSSAAA 
TCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4400 
VLGSPGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4 450 
20 DALATHRHTLGQPATS I 

CGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4 500 

AWGMWHTTSTLTGQLD 
ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4550 
DADRDRIRRGGFLPITD 
25 GACGAGGGCATGGGGATGCAT 
D E G 

The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 (specific for
malonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid
sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 

QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
35 AAVLGHVGGEDI PATAA 

GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGI DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
40 TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGKLGDELTG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

T RAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
45 DEPLAIVGMACRLPGGV 

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 

AS PEELWHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
50 CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 




PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
5 ALAMDPQQRVLLETSW 

AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGS D 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFSYGYGTGAD 
10 CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
15 ACSSSLVALHQAGQSLR 

CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
20 GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050

GAGVL IVERLS DAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
25 GHTVLAVVRGSAVNQDG 

GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLT PADVDA 
30 TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPI EAQ 
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
35 SLKSNIGHAQAASGVA 

GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 
LHADEPSPHVDWTAGAV 
40 CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPET DRPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550
RAAVSS FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 
45 LEAGPVTETPAAS PSGD 

CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 

LPLLVSARSPEALDEQ 
TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV
50 GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRT H FAHRAV 
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 

LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG
GAGCAGCTAGCCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGT 1900 

EQLAAAFPVFARIHQQV 
GTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACG 1950 
5 WDLLDVPDLEVNETGY 

CCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAA 2000 
AQPALFAMQVALFGLLE 
TCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCT 2050 
SWGVRPDAVIGHSVGEL 
1 0 TGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTT 2100 
AAAYVSGVWSLEDACT 
TGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTG 2150 
LVSARARLMQALPAGGV 
ATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGA 2200
MVAVPVSEDEARAVLGE

GGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCG 2250 

GVEIAAVNGPSSVVLS 
GTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACG 2300 
GDEAAVLQAAEGLGKWT 
20 CGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCT 2350 
RLATSHAFHSARMEPML 
GGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGG 2 400 

EE FRAVAEGLTYRTPQ 
TCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGG 2 450 
25 VSMAVGDQVTTAEYWVR 

CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500 

QVRDTVRFGEQVASYED 
CGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCG 2550 
AVFVELGADRSLARLV 
30 ACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGC 2600 
DGVAMLHGDHEIQAAIG 
GCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCT 2 650 

ALAHLYVNGVTVDWPAL 
CCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCT 2700 
35 LGDAPATRVLDLPTYA 

TCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCGGCCGCATCCGAC 2750 
FQHQRYWLESARPAASD 
GCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCGCCGGG 2800 
AGHPVLGSGIALAGSPG 
40 CCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGTGTTCG 285 0 
RVFTGSVPTGADRAVF 
TCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCACGGTC 2900 
VAELALAAADAVDCATV 
GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2950 
45 ERLDIASVPGRPGHGRT 

GACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCGCCGGT 3000 

TVQTWVDEPADDGRRR 
TCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACGCCGAG 3050 
FTVHTRTGDAPWTLHAE 
50 GGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGACGCCGA 3100 
GVLRPHGTALPDAADAE 
GTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGTGTGGC 3150 

WPPPGAVPADGLPGVW 
GCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
RRGDQVFAEAEVDGPDG
TTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGA 3250 

FVVHPDLLDAVFSAVGD 
CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGCGTCGG 3300 
5 GSRQPAGWRDLTVHAS 

ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAGCCATG 3350 
DATVLRACLTRRT DGAM 
GGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCGGAGGC 3400 
GFAAFDGAGLPVLTAEA 
1 0 GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTCGGACG 3450 
VTLREVASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACGACGGT 3500 
GLHRLEWLAVAEAVYDG 
GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 
15 DLPEGHVLI TAAHPDDP 

CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3600 

EDI PTRAHTRATRVLT 
CCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCCACACC 3650 
ALQHHLTTTDHTLIVHT 
20 ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3700 
TTDPAGATVTGLTRTAQ 
GAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 3750 

NEHPHRIRLI ETDHPH 
CCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACCTCCGC 3800 
25 TPLPLAQLATLDHPHLR 

CTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCACACCAC 3850 

LTHHTLHHPHLTPLHTT 
CACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCATCATCA 3900 
TPPTTTPLNPEHAI I I 
30 CCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950 
TGGSGTLAGI LARHLNH 
CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4000 

PHTYLLSRTPPPDATPG 
CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4050 
35 THLPCDVGDPHQLATT 

TCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCGCCACC 4100 
LTHIPQPLTAI FHTAAT 
CTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACCACCGT 4150 
LDDGILHALTPDRLTTV 
40 CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4200 
LHPKANAAWHLHHLTQ 
ACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCGTCCTC 4250 
NQPLTHFVLYS SAAAVL 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTCGACGC 4300 
45 GSPGQGNYAAANAFLDA 

CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4350 

LATHRHTLGQPATSIA 
GGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACGACGCC 44 00 
WGMWHTTSTLTGQLDDA 
50 GACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGACGACGA 4 450 
DRDRIRRGGFLPITDDE 
GGGCATGGGGATGCAT 
G 
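The amino acid rows in the listing above can be checked against the nucleotide rows by translating in the correct reading frame. The following is a minimal sketch, assuming the standard genetic code and assuming (from inspection of the first row) that the first complete codon starts at offset 8; the function name `translate` and the use of "X" for unrecognised codons are illustrative choices, not part of the original disclosure.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate dna starting at offset; a trailing partial codon is ignored."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    # Codons containing anything other than A, C, G, T are reported as "X".
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First nucleotide row of the module 12 AT replacement cassette shown above.
first_row = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
print(translate(first_row, offset=8))  # QLAEALLTLVREST
```

The printed peptide matches the first amino acid row of the listing, which suggests that the remaining rows can be checked the same way once the rows are concatenated.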
The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

5 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 
QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
10 FKDLGIDSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGKLGDELTG 
15 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
20 AS PEELWHLVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
25 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
30 EAFESAGITPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
TDGFGATGSQTSVLSG 
35 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 

ACS S SLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
40 S GECS LALVGGVTVMA 

CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTS FAE 
45 GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
50 ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200

RQALANAGLTPADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 
5 GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 
AVLATYGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
10 GIIKMVQALRHGELPPT 

CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADE PS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
15 GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVSSFGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 

LEAGPVTETPAASPSGD 
CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
20 LPLLVSARSPEALDEQ 

TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
I RRLRAYLDTTPDVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRTH FAHRAV 
25 GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 
LLGDTVI TTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900 
30 EQLADSSVVFAERMAEC 

TGCGGCGGCGTTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950 

AAALREFVDWDLFTVL 
ATGATCCGGCGGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGG 2000 
DDPAVVDRVDVVQPASW 
35 GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050 
AMMVS LAAVWQAAGVRP 
GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 

DAVIGHSQGEIAAACV 
CGGGTGCGGTGTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGC 2150 
40 AGAVS LRDAAR I VT L RS 

CAGGCGATCGCCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGC 2200 

QAIARGLAGRGAMASVA 
CCTGCCCGCGCAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCC 2250 
LPAQDVELVDGAWIAA 
45 ACAACGGGCCCGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300 
HNGPASTVIAGTPEAVD 
CATGTCCTCACCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCAC 2350 

HVLTAHEAQGVRVRRIT 
CGTCGACTATGCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAAC 2400 
50 VDYASHTPHVELIRDE 

TACTCGACATCACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGG 2450 
LLDITSDSSSQTPLVPW 
CTGTCGACCGTGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTA 2500 
LSTVDGTWVDSPLDGEY 
CTGGTACCGGAACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCC 2550

WYRNLREPVGFHPAVS 
AGTTGCAGGCCCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCG 2600 
QLQAQGDTVFVEVSASP 
5 GTGTTGTTGCAGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCG 2650 
VLLQAMDDDVVTVATLR 
TCGTGACGACGGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCT 27 0 0 

RDDGDATRMLTALAQA 
ATGTCCACGGCGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACA 2750 
10 YVHGVTVDWPAILGTTT 

ACCCGGGTACTGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTG 2800 

TRVLDLPTYAFQHQRYW 
GCTCGAGTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGG 28 50 
LESARPAASDAGHPVL 
15 GCTCCGGTATCGCCCTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTTCC 2 900 
GSGIALAGSPGRVFTGS 
GTGCCGACCGGTGCGGACCGCGCGGTGTTCGTCGCCGAGCTGGCGCTGGC 2950 

VPTGADRAVFVAELALA 
CGCCGCGGACGCGGTCGACTGCGCCACGGTCGAGCGGCTCGACATCGCCT 3000 
20 AADAVDCATVERLDIA 

CCGTGCCCGGCCGGCCGGGCCATGGCCGGACGACCGTACAGACCTGGGTC 3050 
SVPGRPGHGRTTVQTWV 
GACGAGCCGGCGGACGACGGCCGGCGCCGGTTCACCGTGCACACCCGCAC 3100 
DEPADDGRRRFTVHTRT 
25 CGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTGCTGCGCCCCCATG 3150 
GDAPWTLHAEGVLRPH 
GCACGGCCCTGCCCGATGCGGCCGACGCCGAGTGGCCCCCACCGGGCGCG 3200 
GTALPDAADAEWPPPGA 
GTGCCCGCGGACGGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 3250 
30 VPADGLPGVWRRGDQVF 

CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3300 

AEAEVDGPDGFVVHPD 
TGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGAAGCCGCCAGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 
35 GGATGGCGCGACCTGACGGTGCACGCGTCGGACGCCACCGTACTGCGCGC 34 00 
GWRDLTVHAS DATVLRA 
CTGCCTCACCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 3450 

CLT RRTDGAMGFAAFD 
GCGCCGGCCTGCCGGTACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
40 GAGLPVLTAEAVTLREV 

GCGTCACCGTCCGGCTCCGAGGAGTCGGACGGCCTGCACCGGTTGGAGTG 3550 

ASPSGSEESDGLHRLEW 
GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTGCCCGAGGGACATG 3600 
LAVAEAVYDGDL PEGH 
45 TCCTGATCACCGCCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3650 
VLITAAHPDDPEDIPTR 
GCCCACACCCGCGCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 3700 

AHTRATRVLTALQHHLT 
CACCACCGACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 375 0 
50 TTDHTLIVHTTTDPAG 

CCACCGTCACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEHPHR 
ATCCGCCTCATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 3850 
IRLIETDHPHTPLPLAQ 
ACTCGCCACCCTCGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900

LATLDHPHLRLTHHTL 
ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3950 
HHPHLTPLHTTTPPTTT 
5 CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4 000 
PLNPEHAI I ITGGSGTL 
CGCCGGCATCCTCGCCCGCCACCTGAACCACCCCCACACCTACCTCCTCT 4 050 

AGILARHLNHPHTYLL 
CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4100 
10 SRTPPPDATPGTHLPCD 

GTCGGCGACCCCCACCAACTCGCCACCACCCTCACCCACATCCCCCAACC 4 150 

VGDPHQLATTLTHIPQP 
CCTCACCGCCATCTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4 200 
LTAIFHTAATLDDGIL 
15 ACGCCCTCACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4250 
HALT PDRLTTVLH PKAN 
GCCGCCTGGCACCTGCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4 300 

AAWHLHHLTQNQPLTHF 
CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4 350 
20 VLYSSAAAVLGS PGQG 

ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 4 400 
NYAAANAFLDALATHRH 
ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 4 450 
TLGQPATSIAWGMWHTT 
25 CAGCACCCTCACCGGACAACTCGACGACGCCGACCGGGACCGCATCCGCC 4 500 
STLTGQLDDADRDRIR 
GCGGCGGTTTCCTCCCGATCACGGACGACGAGGGCATGGGGATGCAT 
RGGFLPITDDEG 

Phage KC515 DNA was prepared using the procedure described in Genetic
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A
phage suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on S.
lividans TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to
circularize at the cos site, subsequently digested with restriction enzymes BamHI and
PstI, and dephosphorylated with SAP.

Each module 8 cassette described above was excised with restriction enzymes
BglII and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage
DNA prepared as described above. The ligation mixture containing KC515 and various
cassettes was transfected into protoplasts of Streptomyces lividans TK24 using the
procedure described in Genetic Manipulation of Streptomyces, A Laboratory Manual,
edited by D. Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques
were restreaked on plates overlaid with TK24 spores. Single plaques were picked and
resuspended in 200 µL of nutrient broth. Phage DNA was prepared by the boiling method
(Hopwood et al., supra). PCR with primers spanning the left and right boundaries of
the recombinant phage was used to verify that the correct phage had been isolated. In most
cases, at least 80% of the plaques contained the expected insert. To confirm the presence
of the resistance marker (thiostrepton), a spot test is used, as described in Lomovskaya et
al. (1997), in which a plate with spots of phage is overlaid with a mixture of spores of
TK24 and phiC31 TK24 lysogen. After overnight incubation, the plate is overlaid with
antibiotic in soft agar. A working stock is made of all phage containing desired
constructs.
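The ligation described above relies on BglII and NsiI ends being compatible with BamHI and PstI ends, respectively. The following sketch illustrates why, using the commonly documented recognition sequences and cut positions of these four enzymes; the `Enzyme` class and the `compatible` helper are hypothetical names introduced only for this illustration, and the model assumes palindromic sites in which the bottom-strand cut mirrors the top-strand cut.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str     # palindromic recognition sequence, 5'->3' on the top strand
    top_cut: int  # number of bases to the left of the top-strand cut

    def overhang(self):
        """Return (kind, bases) for the single-stranded end this enzyme leaves."""
        # For a palindromic site the bottom-strand cut mirrors the top-strand cut.
        bottom_cut = len(self.site) - self.top_cut
        lo, hi = sorted((self.top_cut, bottom_cut))
        if lo == hi:
            return ("blunt", "")
        kind = "5'" if self.top_cut < bottom_cut else "3'"
        return (kind, self.site[lo:hi])


def compatible(a, b):
    """Palindromic overhangs anneal when they are of the same kind and sequence."""
    return a.overhang() == b.overhang()


BGLII = Enzyme("BglII", "AGATCT", 1)  # A^GATCT
BAMHI = Enzyme("BamHI", "GGATCC", 1)  # G^GATCC
NSII = Enzyme("NsiI", "ATGCAT", 5)    # ATGCA^T
PSTI = Enzyme("PstI", "CTGCAG", 5)    # CTGCA^G

for insert_end, vector_end in ((BGLII, BAMHI), (NSII, PSTI)):
    print(insert_end.name, vector_end.name,
          insert_end.overhang(), compatible(insert_end, vector_end))
```

In this model the BglII and BamHI ends both carry a 5' GATC overhang and the NsiI and PstI ends both carry a 3' TGCA overhang, so each cassette can be inserted in only one orientation. The ligated BglII/BamHI and NsiI/PstI junctions no longer match either parental recognition sequence.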

Streptomyces hygroscopicus ATCC 14891 (see US Patent No. 3,244,592, issued
5 Apr 1966, incorporated herein by reference) mycelia were infected with the
recombinant phage by mixing the spores and phage (1 x 10^8 of each), and incubating on
R2YE agar (Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D.
Hopwood et al.) at 30°C for 10 days. Recombinant clones were selected and plated on
minimal medium containing thiostrepton (50 µg/ml) to select for the thiostrepton
resistance-conferring gene. Primary thiostrepton-resistant clones were isolated and
purified through a second round of single colony isolation, as necessary. To obtain
thiostrepton-sensitive revertants that underwent a second recombination event to evict the
phage genome, primary recombinants were propagated in liquid media for two to three
days in the absence of thiostrepton and then spread on agar medium without thiostrepton
to obtain spores. Spores were plated to obtain about 50 colonies per plate, and
thiostrepton-sensitive colonies were identified by replica plating onto
thiostrepton-containing agar medium. PCR was used to determine which of the
thiostrepton-sensitive colonies reverted to the wild type (reversal of the initial
integration event), and which contain the desired AT swap at module 8 in the
ATCC 14891-derived cells. The PCR primers used amplified either the KS/AT junction or
the AT/DH junction of the wild-type and the desired recombinant strains. Fermentation
of the recombinant strains, followed by isolation of the metabolites and analysis by
LC-MS and NMR, is used to characterize the novel polyketide compounds.
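The two recombination events described above can be pictured with a simplified token model. This sketch is an illustration only: the genome is reduced to a list of labelled segments, a crossover is treated as occurring at the start of a homology arm, and the labels ("KS_arm", "DH_arm", "AT_wt", "AT_new", "phage_with_marker") and function names are hypothetical.

```python
def integrate(chromosome, phage, arm):
    """First crossover within `arm`: the circular phage is opened and spliced in."""
    c = chromosome.index(arm)
    p = phage.index(arm)
    return chromosome[:c] + phage[p:] + phage[:p] + chromosome[c:]


def excise(genome, arm):
    """Second crossover between the two copies of `arm`: the span between them is lost."""
    first = genome.index(arm)
    second = genome.index(arm, first + 1)
    return genome[:first] + genome[second:]


wild_type = ["KS_arm", "AT_wt", "DH_arm"]
phage = ["KS_arm", "AT_new", "DH_arm", "phage_with_marker"]

# Primary recombinant: carries the marker, hence thiostrepton resistant.
primary = integrate(wild_type, phage, "KS_arm")
print(primary)

# A second crossover in the same arm restores the wild type;
# a second crossover in the other arm leaves the AT swap.
print(excise(primary, "KS_arm"))
print(excise(primary, "DH_arm"))
```

Both outcomes of the second event lose the phage segment and its marker, which is consistent with the text: thiostrepton sensitivity alone does not distinguish revertants from AT-swap strains, so PCR across the KS/AT or AT/DH junction is needed.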

Example 2

Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506
The present invention also provides the 13-desmethoxy derivatives of FK-506 and
the novel PKS enzymes that produce them. A variety of Streptomyces strains that produce
FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927),
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S.
hygroscopicus subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366,
incorporated herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S.
Patent No. 5,116,756, incorporated herein by reference; and S. sp. MA6548, described
in Motamedi et al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al., 1997,
"Structural organization of a multifunctional polyketide synthase involved in the
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80,
each of which is incorporated herein by reference.

The complete sequence of the FK-506 gene cluster from Streptomyces sp.
MA6548 is known, and the sequences of the corresponding gene clusters from other
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant
gene clusters of the present invention differ from the naturally occurring gene clusters in
that the AT domain of module 8 of the naturally occurring PKSs is replaced by an AT
domain specific for malonyl CoA or methylmalonyl CoA. These AT domain
replacements are made at the DNA level, following the methodology described in
Example 1.
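At the sequence level, an AT domain replacement of this kind amounts to exchanging the span between two flanking restriction sites for the corresponding span of a donor sequence. The sketch below illustrates the idea on toy strings; the function name `replace_between` is hypothetical, the short sequences are placeholders rather than PKS sequences, and the sites are written as AvrII (CCTAGG) and XhoI (CTCGAG) purely for illustration. It assumes each site occurs once in the region of interest.

```python
def find_span(seq, left_site, right_site):
    """Return (start, end) covering left_site through right_site, inclusive."""
    start = seq.find(left_site)
    if start < 0:
        raise ValueError("left flanking site not found")
    end = seq.find(right_site, start + len(left_site))
    if end < 0:
        raise ValueError("right flanking site not found")
    return start, end + len(right_site)


def replace_between(recipient, donor, left_site, right_site):
    """Swap the site-to-site span of recipient for the site-to-site span of donor."""
    r0, r1 = find_span(recipient, left_site, right_site)
    d0, d1 = find_span(donor, left_site, right_site)
    return recipient[:r0] + donor[d0:d1] + recipient[r1:]


AVRII, XHOI = "CCTAGG", "CTCGAG"

# Toy placeholder sequences: flanks + site + "domain" + site + flanks.
recipient = "AAAA" + AVRII + "TTTTTT" + XHOI + "CCCC"
donor = "GG" + AVRII + "ACGTACGT" + XHOI + "GG"

hybrid = replace_between(recipient, donor, AVRII, XHOI)
print(hybrid)
```

The recipient flanks are preserved and only the span between the two sites changes, which mirrors how the hybrid module 8 sequences retain the native sequence outside the replaced AT region.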

The naturally occurring module 8 sequence for the MA6548 strain is shown 
below, followed by the illustrative hybrid module 8 sequences for the MA6548 strains. 

25 GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
30 RTTVRRAAVRERSLAD 

GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
5 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
10 TAAAHDE PLAIVGMACR 

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAI TEFPADRGWDV 
15 ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 

DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 

HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
20 ISPREALAMDPQQRVL 

TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGI T PDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
ARGSDTGVFIGAFSYGY 
25 CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 95 0 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
30 VTVDTACSSSLVALHQA 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
35 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
40 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
45 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 

ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 

P I EAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
50 PLLLGSLKSNIGHAQA 

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSS FGVSGT 
5 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 

NAHI ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
10 GPLPAAPPSAPGEDLPL 

CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARS PEALDEQI GRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
RAYLDTGP GVDRAAVA 
1 5 AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 

QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAACTCG 2100 
20 VYSGQGTQHPAMGEQL 

CGGCCGCGTTCCCCGTGTTCGCCGATGCCTGGCACGACGCGCTCCGACGG 2150 
AAAFPVFADAWHDALRR 
CTCGACGACCCCGACCCGCACGACCCCACACGGAGCCAGCACACGCTCTT 2200 
LDDPDPHDPTRSQHTLF 
25 CGCCCACCAGGCGGCGTTCACCGCCCTCCTGAGGTCCTGGGACATCACGC 2250 

AHQAAFTALLRSWDIT 
CGCACGCCGTCATCGGCCACTCGCTCGGCGAGATCACCGCCGCGTACGCC 2300 
PHAVI GHSLGE ITAAYA 
GCCGGGATCCTGTCGCTCGACGACGCCTGCACCCTGATCACCACGCGTGC 2350 
30 AGILSLDDACTLITTRA 

CCGCCTCATGCACACGCTTCCGCCGCCCGGCGCCATGGTCACCGTGCTGA 2400 

RLMHTLPPPGAMVTVL 
CCAGCGAGGAGGAGGCCCGTCAGGCGCTGCGGCCGGGCGTGGAGATCGCC 2450 
TSEEEARQALRPGVEIA 
35 GCGGTCTTCGGCCCGCACTCCGTCGTGCTCTCGGGCGACGAGGACGCCGT 2500 

AVFGPHSVVLSGDEDAV 
GCTCGACGTCGCACAGCGGCTCGGCATCCACCACCGTCTGCCCGCGCCGC 2550 

LDVAQRLG I HHRLPAP 
ACGCGGGCCACTCCGCGCACATGGAACCCGTGGCCGCCGAGCTGCTCGCC 2 600 
40 HAGHSAHMEPVAAELLA 

ACCACTCGCGAGCTCCGTTACGACCGGCCCCACACCGCCATCCCGAACGA 2 650 

TTRELRYDRPHTAI PND 
CCCCACCACCGCCGAGTACTGGGCCGAGCAGGTCCGCAACCCCGTGCTGT 2700 
PTTAEYWAEQVRNPVL 
45 TCCACGCCCACACCCAGCGGTACCCCGACGCCGTGTTCGTCGAGATCGGC 2750 

FHAHTQRYPDAVFVEIG 
CCCGGCCAGGACCTCTCACCGCTGGTCGACGGCATCGCCCTGCAGAACGG 2800 

PGQDLSPLVDGIALQNG 
CACGGCGGACGAGGTGCACGCGCTGCACACCGCGCTCGCCCGCCTCTTCA 2850 
50 TADEVHALHTALARLF 

CACGCGGCGCCACGCTCGACTGGTCCCGCATCCTCGGCGGTGCTTCGCGG 2900 
TRGATLDWSRI LGGASR 
CACGACCCTGACGTCCCCTCGTACGCGTTCCAGCGGCGTCCCTACTGGAT 2 950 
HDPDVPSYAFQRRPYWI 
CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCA 3000

ESAPPATADSGHPVLG 
CCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTG 3050 
TGVAVAGS PGRVFTGPV 
5 CCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100 

PAGADRAVFIAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 3150 

ADAT DCATVEQLDVTS 
TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 3200 
10 VPGGSARGRATAQTWVD 

GAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGG 3250 

EPAADGRRRFTVHTRVG 
CGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCG 3300 
DAPWT LHAEGVLRPGR 
15 TGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTG 3350 

VPQPEAVDTAWPPPGAV 
CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 34 00 

PADGLPGAWRRADQVFV 
CGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 34 50 
20 EAEVDSPDGFVAHPDL 

TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 35 00 
LDAVFSAVGDGSRQPTG 
TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550 
WRDLAVHASDATVLRAC 
25 CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600 

LTRRDSGVVELAAFDG 
CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650 
AGMPVLTAESVTLGEVA 
TCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 37 00 
30 SAGGSDESDGLLRLEWL 

GCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 37 50 

PVAEAHYDGADELPEG 
ACACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAAC 3800 
YTLITATHPDDPDDPTN 
35 CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 3850 

PHNT PTRTHTQTTRVLT 
CGCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3900 

ALQHHLITTNHTLIVH 
CCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCA 3950 
40 TTT DPPGAAVTGLTRTA 

CAAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCA 4000 

QNEHPGRIHLIETHHPH 
CACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTAC 4050 
TPLPLTQLTTLHQPHL 
45 GCCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACC 4100 

RLTNNTLHTPHLTPITT 
CACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAA 4150 

HHNTTTTTPNTPPLNPN 
CCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCG 42 00 
50 HAILITGGSGTLAGIL 

CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4250 
ARHLNHPHTYLLSRTPP 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCAC 4300 
PPTTPGTHIPCDLTDPT 
CCAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCT 4350

QITQALTHIPQPLTGI 
TCCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 4 00 
FHTAATLDDATLTNLTP 
5 CAACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCT 4 450 

QHLTTTLQPKADAAWHL 
CCACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCA 4500 

HHHTQNQPLTHFVLYS 
GCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCC 4 550 
10 SAAATLGS PGQANYAAA 

AACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACC 4 600 

NAFLDALATHRHTQGQP 
CGCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCA 4650
ATTIAWGMWHTTTTLT
GCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 4700

SQLTDS DRDRIRRGGFL 
CCGATCTCGGACGACGAGGGCATGC 
PISDDEGM 
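The nucleotide rows in these listings appear to follow a fixed layout of 50 bases per row followed by a cumulative coordinate, which makes transcription damage detectable. The following sketch shows one way to check that layout; the function names are hypothetical, and the parser assumes that any digits before the bases are marginal line numbers and any digits after them are the coordinate. A row made up only of the letters A, C, G and T would be read as nucleotides even if it were an amino acid row.

```python
import re

ROW = re.compile(r"^(\d*)([ACGT]+)(\d*)$")


def parse_row(line):
    """Return (bases, coordinate or None), or None if the line is not a nucleotide row."""
    compact = re.sub(r"\s+", "", line.upper())  # tolerate stray spaces in bases and digits
    match = ROW.match(compact)
    if match is None:
        return None
    bases, coord = match.group(2), match.group(3)
    return bases, (int(coord) if coord else None)


def check_rows(lines, width=50):
    """List (line_number, row_length, printed_coordinate) for rows that break the layout."""
    problems = []
    expected = 0
    for number, line in enumerate(lines, start=1):
        parsed = parse_row(line)
        if parsed is None:
            continue
        bases, coord = parsed
        expected += len(bases)
        if coord is not None and (len(bases) != width or coord != expected):
            problems.append((number, len(bases), coord))
            expected = coord  # resynchronise on the printed coordinate
    return problems


rows = [
    "GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50",
    "MRLYEAARRTGS PVVV",
    "GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100",
]
print(check_rows(rows))
```

A row that has lost or gained a base, or whose coordinate was misread, is reported with its line number so that it can be compared against a duplicate copy of the same region elsewhere in the listings.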

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
25 AAALDDAPDVPLLRGLR 

GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
30 TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGIDSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
35 VQLRNALTTATGVRLNA 

ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPT PRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 
DELAGTRAPVAARTAA 
40 CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAH DE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
45 GTDAITEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
50 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
I SPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
5 GTGADTNGFGATGSQT 

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA 
1 0 AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
15 GLAPDGRAKAFGAGADG 

TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 12 00 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 12 50 
DAERHGHTVLALVRGSA 
20 GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 
ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 

QERVI HQALANAKLT P 
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
25 ADVDAVEAHGTGT RLGD 

CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 

PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
3 0 CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 

ELPPTLHADEPS PHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
35 TAGAVELLTSARPWPG 

CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 1700 
TGRPRRAGVSSFGISGT 
AACGCCCACGTCATCCTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAA 17 50 
NAHVILESAPPTQPADN 
40 CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 18 00 
AVIERAPEWVPLVI SA 
GGACCCAGTCGGCTTTGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1900 
45 AAS PGVDMRAVAS T LAM 

GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
TCACCGGCACCGCTGTGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGA 2000 
VTGTAVSDPRAVFVFPG 
50 CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050 
QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 

VFARIHQQVWDLLDVP 
ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150 
DLEVNETGYAQPALFAM
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVALFGLLESWGVRPDA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
5 VI GHSVGELAAAYVSG 

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2300 
VWSLEDACTLVSARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2350 
MQALPAGGVMVAVPVSE 
1 0 GGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCA 24 00 
DEARAVLGEGVE IAAV 
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAG 24 50 
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2500 
15 AAEGLGKWTRLATSHAF 

CCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCG 2550 

HSARME PMLEEFRAVA 
AAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2600 
EGLTYRTPQVSMAVGDQ 
20 GTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2650 
VTTAEYWVRQVRDTVRF 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTG 2700 

GEQVAS YEDAVFVELG 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 27 50 
25 ADRSLARLVDGVAMLHG 

GACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800 

DHEIQAAI GALAHLYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 2850 
GVTVDWPALLGDAPAT 
30 GGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCAGCGCTACTGGCTC 2900 
RVLDLPTYAFQHQRYWL 
GAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCAC 2950 

ESAPPATADSGHPVLGT 
CGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGC 3000 
35 GVAVAGS PGRVFTGPV 

CCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PAGADRAVF IAELALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGT 3100
ADATDCATVEQLDVTSV 
40 GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATG 3150 
PGGSARGRATAQTWVD 
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
EPAADGRRRFTVHTRVG 
GACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGT 3250 
45 DAPWTLHAEGVLRPGRV 

GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300 

PQPEAVDTAWPPPGAV 
CCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTC 3350 
PADGLPGAWRRADQVFV 
50 GAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCT 3400 
EAEVDS PDGFVAHPDLL 
CGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGAT 3450 

DAVFSAVGDGSRQPTG 
GGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGC 3500 
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WRDLAVHAS DATVLRAC 
CTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550 

LTRRDSGVVELAAFDGA 
CGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGT 3600 
5 GMPVLTAESVTLGEVA 

CGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTG 3650 
SAGGSDESDGLLRLEWL 
CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTA 37 00 
PVAEAHYDGADELPEGY 
1 0 CACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 3750 
TLITATHPDDPDDPTN 
CCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACC 3800 
PHNTPTRTHTQTTRVLT 
GCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACAC 3850 
15 ALQHHLITTNHTLIVHT 

CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3900 

TTDPPGAAVTGLTRTA 
AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCAC 3950 
QNEHPGRIHLIETHHPH 
20 ACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACG 4000 
TPLPLTQLTTLHQPHLR 
CCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCC 4050 

LTNNTLHTPHLTPITT 
ACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAAC 4100 
25 HHNTTTTTPNTPPLNPN 

CACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGC 4150 

HAILITGGSGTLAGILA 
CCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCAC 4200 
RHLNHPHTYLLSRTPP 
30 CCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACC 4250 
PPTTPGTHIPCDLTDPT 
CAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTT 4300 

QITQALTHIPQPLTGIF 
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4 350 
35 HTAATLDDATLTNLTP 

AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4 4 00 
QHLTTTLQPKADAAWHL 
CACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAG 4450 
HHHTQNQPLTHFVLYSS 
40 CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4500 
AAATLGS PGQANYAAA 
ACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCC 4 550 
NAFLDALATHRHTQGQP 
GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4 600 
45 ATTIAWGMWHTTTTLTS 

CCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGC 4 650 

QLTDSDRDRIRRGGFL 
CGATCTCGGACGACGAGGGCATGC 
PISDDEGM 


The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.
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GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDVPLLRGLR 
5 GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERS LAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
10 SWNSTATVLGHLGAEDI 

CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DS LTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
15 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 5 0 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
20 TAAAHDE PLAIVGMACR 

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV 
25 ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKT FVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 

HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 
30 ISPREALAMDPQQRVL 

TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGI TPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 8 50 
ARGSDTGVFIGAFSYGY 
35 CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
40 VTVDTACSSSLVALHQA 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFS RQR 
45 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
50 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLT P 
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CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 
P I EAQALLATYGQDRAT 
5 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI I KMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
10 ELPPTLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC 1700 
TGRPRRAGVSSFGVSGT 
1 5 AACGCCCACGTCATCCTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGA 17 50 
NAHVILESAPPAQPAEE 
GGCGCAGCCTGTTGAGACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGG 1800 

AQPVETPVVASDVLPL 
TGATATCGGCCAAGACCCAGCCCGCCCTGACCGAACACGAAGACCGGCTG 1850 
20 VISAKTQPALTEHEDRL 

CGCGCCTACCTGGCGGCGTCGCCCGGGGCGGATATACGGGCTGTGGCATC 1900 

RAYLAAS PGADI RAVAS 
GACGCTGGCGGTGACACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTG 1950 
TLAVTRSVFEHRAVLL 
25 GAGATGACACCGTCACCGGCACCGCGGTGACCGACCCCAGGATCGTGTTT 2000 
GDDTVTGTAVTDPRIVF 
GTCTTTCCCGGGCAGGGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCG 2 050 

VFPGQGWQWLGMGSALR 
CGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGT 2100 
30 DS SVVFAERMAECAAA 

TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGCG 2150 
LREFVDWDLFTVLDDPA 
GTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGT 22 00 
VVDRVDVVQPASWAMMV 
35 TTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGA 2250 
S LAAVWQAAGVRP DAV 
TCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTG 2300 
IGHSQGE IAAACVAGAV 
TCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGC 2350 
40 SLRDAARIVTLRSQAIA 

CCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGC 24 00 

RGLAGRGAMASVALPA 
AGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCCC 2450 
QDVELVDGAWIAAHNGP 
45 GCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCAC 2500 
ASTVIAGTPEAVDHVLT 
CGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTATG 2550 

AHEAQGVRVRRITVDY 
CCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACATC 2 600 
50 ASHTPHVELIRDELLDI 

ACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGT 2650 

TSDSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 2700 
DGTWVDS PLDGEYWYR 
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ACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 2750 
NLRE PVGFHPAVSQLQA 
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 
QGDTVFVEVSASPVLLQ 
5 GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 2850 
AMDDDVVTVATLRRDD 
GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2 900 
GDATRMLTALAQAYVHG 
GTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2 950 
10 VTVDWPAILGTTTTRVL 

GGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGG 3000 

DLPTYAFQHQRYWLES 
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGTC 3050 
APPATADSGHPVLGTGV 
15 GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100 
AVAGS PGRVFTGPVPAG 
TGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 

ADRAVFIAELALAAAD 
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200 
20 AT DCATVEQLDVTSVPG 

GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250 

GSARGRATAQTWVDEPA 
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300 
ADGRRRFTVHTRVG DA 
25 CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350 
PWTLHAEGVLRPGRVPQ 
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 34 00 

PEAVDTAWPPPGAV PAD 
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 3450 
30 GLPGAWRRADQVFVEA 

AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500 
EVDS PDGFVAHPDLLDA 
GTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550 
VFSAVGDGSRQPTGWRD 
35 CCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600 
LAVHASDATVLRACLT 
GCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650 
RRDSGVVELAAFDGAGM 
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 3700 
40 PVLTAESVTLGEVASAG 

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 3750 

GSDESDGLLRLEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 3800 
AEAHYDGADELPEGYTL 
45 ATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 3850 
I TATH PDDPDDPTNPHN 
CACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTCC 3900 

TPTRTHTQTTRVLTAL 
AACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCACC 3950 
50 QHHLI TTNHTLIVHTTT 

GACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACGA 4 000 

DPPGAAVTGLTRTAQNE 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCAC 4050 
HPGRIHLIETHHPHTP 
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TCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCACC 4100 
LPLTQLTTLHQPHLRLT 
AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACAA 4150 
NNTLHTPHLTPITTHHN 
5 CACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCCA 4200 
TTTTTPNTPPLNPNHA 
TCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCAC 4250 
ILITGGSGTLAGILARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4300 
10 LNHPHTYLLSRTPPPPT 

CACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4350 

TPGTHIPCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 44 00 
TQALTHI PQPLTGIFHT 
1 5 GCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACCT 44 50 
AATLDDATLTNLTPQHL 
CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4500 

TTTLQPKADAAWHLHH 
ACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCC 4 550 
20 HTQNQPLTHFVLYSSAA 

GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4 600 

ATLGS PGQANYAAANAF 
CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4650
LDALATHRHTQGQPAT 
25 CCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACTC 4 7 00 
TIAWGMWHTTTTLTSQL 
ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCTC 4 750 

TDSDRDRIRRGGFLPIS 
GGACGACGAGGGCATGC 
30 D D E G M 

The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
35 MRLYEAARRTGSPVVV 

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAAL DDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERSLAD 
40 GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNS TATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
45 PATTTFKELGIDSLTA 

TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400 
TAVFDFPTPRALAARLG 
50 CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
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TAAAHDEPLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
5 GTDAI TEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKT FVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 
HGGFLDGATGFDAAFFG 
1 0 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 7 50 
ISPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 8 00 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
15 ARGSDTGVFIGAFSYGY 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
20 GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA 
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
25 VTVMAS PGGFVEFSRQR 

GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 
TSFAEGAGALVVERLS 
30 ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
35 QERVI HQALANAKLT P 

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
P I EAQALLATYGQDRAT 
40 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
45 ELPPTLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSSFGVSGT 
50 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 
NAHI ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900
LVSARSPEALDEQIGRL
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950
RAYLDTGPGVDRAAVA

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000

QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
DTVIGAPPADQADELVF 
10 CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VYSGQGTQHPAMGEQL 
CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150 
AAAFPVFARI HQQVWDL 
CTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGC 2200 
15 LDVPDLEVNETGYAQPA 

CCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 2250 

LFAMQVALFGLLESWG 
TACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCG 2 300 
VRPDAVI GHSVGELAAA 
20 TATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGC 2350 
YVSGVWSLEDACTLVSA 
GCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTG 2400 

RARLMQAL PAGGVMVA 
TCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAG 2450 
25 VPVSEDEARAVLGEGVE 

ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500 

IAAVNGPSSVVLSGDEA 
CGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGA 2 550 
AVLQAAEGLGKWTRLA 
30 CCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2600 
TSHAFHSARME PMLEEF 
CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGC 2 650 

RAVAEGLTYRT PQVSMA 
CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 2700 
35 VGDQVTTAEYWVRQVR 

ACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTC 2750 
DTVRFGEQVAS YEDAVF 
GTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGC 2 800 
VELGADRSLARLVDGVA 
40 GATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCC 2850 
MLHGDHEIQAAIGALA 
ACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGAT 2900 
HLYVNGVTVDWPALLGD 
GCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCA 2950 
45 APATRVLDLPTYAFQHQ 

GCGCTACTGGCTCGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RYWLESAPPATADSGH 
CCGTCCTCGGCACCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTC 3050 
PVLGTGVAVAGS PGRVF 
50 ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100 
TGPVPAGADRAVFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 

ALAAADAT DCATVEQL 
ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 3200 
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DVT SVPGGSARGRATAQ 
ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250 

TWVDEPAADGRRRFTVH 
CACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3300 
5 TRVGDAPWTLHAEGVL 

GCCCCGGCCGCGTGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCG 3350 
RPGRVPQPEAVDTAWPP 
CCGGGCGCGGTGCCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGA 34 00 
PGAVPADGLPGAWRRAD 
1 0 CCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 3450 
QVFVEAEVDSPDGFVA 
ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3500 
HPDLLDAVFSAVGDGSR 
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 3550 
15 QPTGWRDLAVHAS DATV 

GCTGCGCGCCTGCCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCG 3600 

LRACLTRRDSGVVELA 
CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3 650 
AFDGAGMPVLTAESVTL 
20 GGCGAGGTCGCGTCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCG 3700 
GEVASAGGSDES DGLLR 
GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGC 3750 

LEWLPVAEAHYDGADE 
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 38 00 
25 LPEGYTLITATHPDDPD 

GACCCCACCAACCCCCACAACACACCCACACGCACCCACACACAAACCAC 3850 

DPTNPHNTPTRTHTQTT 
ACGCGTCCTCACCGCCCTCCAACACCACCTCATCACCACCAACCACACCC 3900 
RVLTALQHHLITTNHT 
30 TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTC 3950 
L IVHTTTDPPGAAVTGL 
ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4000 

TRTAQNEHPGRIHLIET 
CCACCACCCCCACACCCCACTCCCCCTCACCCAACTCACCACCCTCCACC 4050 
35 HHPHTPLPLTQLTTLH 

AACCCCACCTACGCCTCACCAACAACACCCTCCACACCCCCCACCTCACC 4100 
QPHLRLTNNTLHT PHLT 
CCCATCACCACCCACCACAACACCACCACAACCACCCCCAACACCCCACC 4150 
PITTHHNTTTTTPNTPP 
40 CCTCAACCCCAACCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCG 4200 
LNPNHAILITGGSGTL 
CCGGCATCCTCGCCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCC 4250 
AGILARHLNHPHTYLLS 
CGCACACCACCACCCCCCACCACACCCGGCACCCACATCCCCTGCGACCT 4300 
45 RTPPPPTTPGTHI PCDL 

CACCGACCCCACCCAAATCACCCAAGCCCTCACCCACATACCACAACCCC 4350 

TDPTQITQALTH I PQP 
TCACCGGCATCTTCCACACCGCCGCCACCCTCGACGACGCCACCCTCACC 4 400 
LTGI FHTAATLDDATLT 
50 AACCTCACCCCCCAACACCTCACCACCACCCTCCAACCCAAAGCCGACGC 4 450 
NLTPQHLTTTLQPKADA 
CGCCTGGCACCTCCACCACCACACCCAAAACCAACCCCTCACCCACTTCG 4500 

AWHLHHHTQNQPLTHF 
TCCTCTACTCCAGCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAAC 4 55 0 
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VLYSSAAATLGSPGQAN 
TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4 600 

YAAANAFL DALATHRHT 
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4 650 
5 QGQPATT IAWGMWHTT 

CCACACTCACCAGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 4700 

TTLTSQLTDSDRDRIRR 

GGCGGCTTCCTGCCGATCTCGGACGACGAGGGCATGC 

GGFLPISDDEGM 


The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
MRLYEAARRTGSPVVV 
15 GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
20 RSPCCPTTSAPTPPSRS 

TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
PATTTFKELGIDSLTA 
25 TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
30 DELAGTRAPVAARTAA 

CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 
LPGGVAS PQELWRLVAS 
35 CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GT DAI TEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 
40 HGGFLDGATGFDAAFFG 

GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

I SPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
45 GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
ARGS DTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
50 SVLSGRLSYFYGLEGPS 

GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA 
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AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS PGGFVEFS RQR 
5 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
10 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
1 5 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 

PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
20 PLLLGSLKSNIGHAQA 

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPS PHVDW 
25 GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 
TAGAVELLTSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 17 00 
TGRPRRAAVSSFGVSGT 
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 17 50 
30 NAHIILEAGPVKTGPVE 

GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAIEAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
GPLPAAPPSAPGEDLPL 
35 CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 
LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 

RAYLDTGPGVDRAAVA 
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
40 QTLARRTHFTHRAVLLG 

GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VY'SGQGTQH PAMGEQL 
45 CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150 
ADS SVVFAERMAECAAA 
TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 

LREFVDWDLFTVLDDPA 
GGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2250 
50 VVDRVDVVQPASWAMM 

TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2300 
VSLAAVWQAAGVRP DAV 
ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 
I GHSQGEIAAACVAGAV 
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GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 2400 

SLRDAARIVTLRSQAI 
CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 2450 
ARGLAGRGAMASVALPA 
5 CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2500 
QDVELVDGAWIAAHNGP 
CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2550 

ASTVIAGT PEAVDHVL 
CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2600 
10 TAHEAQGVRVRRI TVDY 

GCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACAT 2 650 

ASHTPHVELIRDELLDI 
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 27 00 
TSDSSSQTPLVPWLST 
1 5 TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 2750 
VDGTWVDSPLDGEYWYR 
AACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 2800 

NLREPVGFH PAVSQLQA 
CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 2850 
20 QGDTVFVEVSASPVLL 

AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2 900 
QAMDDDVVTVATLRRDD 
GGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2 950 
GDATRMLTALAQAYVHG 
25 CGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000 
VTVDWPAI LGTTTTRV 
TGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 3050 
LDLPTYAFQHQRYWLES 
GCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100 
30 APPATADSGHPVLGTGV 

CGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 3150 

AVAGS PGRVFTGPVPA 
GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 3200 
GADRAVFIAELALAAAD 
35 GCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGG 3250 
ATDCATVEQLDVTSVPG 
CGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCG 3300 

GSARGRATAQTWVDEP 
CCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCC 3350 
40 AADGRRRFTVHTRVGDA 

CCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 34 00 

PWTLHAEGVLRPGRVPQ 
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 3450 
PEAVDTAWPPPGAVPA 
45 ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3500 
DGLPGAWRRADQVFVEA 
GAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGC 3550 

EVDSPDGFVAHPDLLDA 
GGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCG 3600 
50 VFSAVGDGSRQPTGWR 

ACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACC 3650 
DLAVHAS DATVLRACLT 
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 3700 
RRDSGVVELAAFDGAGM 
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GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 3750 

PVLTAE SVTLGEVASA 
GCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTG 38 00 
GGSDESDGLLRLEWLPV 
5 GCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 3850 
AEAHYDGADELPEGYTL 
CATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 

ITATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3950 
10 NT PTRTHTQTTRVLTAL 

CAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCAC 4 000 

QHHLITTNHTLIVHTTT 
CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4050 
DPPGAAVTGLTRTAQN 
15 AACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCA 4100 
EHPGRIHLIETHHPHTP 
CTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4150 

LPLTQLTTLHQPHLRLT 
CAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4200 
20 NNTLHTPHLTPITTHH 

ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4250 
NTTTTTPNTPPLNPNHA 
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4300 
I L I TGGSGTLAGILARH 
25 CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4350 
LNHPHTYLLSRTPPPP 
CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4 4 00 
TTPGTHI PCDLTDPTQI 
ACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACAC 44 50 
30 TQALTHI PQPLTGIFHT 

CGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4500 

AATLDDATLTNLTPQH 
TCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCAC 4 550 
LTTTLQPKADAAWHLHH 
35 CACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGC 4 600 
HTQNQPLTHFVLYSSAA 
CGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCT 4 650 

A T L G S P GQANYAAANA 
TCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4700 
40 FLDALATHRHTQGQPAT 

ACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACT 4750

TIAWGMWHTTTTLTSQL
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCT 4800
TDSDRDRIRRGGFLPI

CGGACGACGAGGGCATGC
SDDEGM

Example 3 

Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520 
The present invention provides a variety of recombinant PKS genes in addition to
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520
and FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent
No. 5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT
coding sequences have been replaced by either the rapAT3 (the AT domain from module
3 of the rapamycin PKS), rapAT12, eryAT1 (the AT domain from module 1 of the
erythromycin (DEBS) PKS), or eryAT2 coding sequences. Each of these constructs
provides a PKS that produces the 13-desmethoxy-13-methyl derivative, except for the
rapAT12 replacement, which provides the 13-desmethoxy derivative, i.e., it has a
hydrogen where the other derivatives have methyl.

Figure 7 shows the process used to generate the AT replacement constructs. First,
a fragment of ~4.5 kb containing module 8 coding sequences from the FK-520 cluster of
ATCC 14891 was cloned using the convenient restriction sites SacI and SphI (Step A in
Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment comprising
module 8 coding sequences from other FK-520 or FK-506 clusters can be different
depending on the DNA sequence, but the overall scheme is identical. The unique SacI
and SphI restriction sites at the ends of the FK-520 module 8 fragment were then changed
to unique BglII and NsiI sites by ligation to synthetic linkers (described in the preceding
Examples, see Step B of Figure 7). Fragments containing sequences 5' and 3' of the AT8
sequences were then amplified using primers, described above, that introduced either an
AvrII site or an NheI site at two different KS/AT boundaries and an XhoI site at the
AT/DH boundary (Step C of Figure 7). Heterologous AT domains from the rapamycin
and erythromycin gene clusters were amplified using primers, as described above, that
introduced the same sites as just described (Step D of Figure 7). The fragments were
ligated to give hybrid modules with in-frame fusions at the KS/AT and AT/DH
boundaries (Step E of Figure 7). Finally, these hybrid modules were ligated into the
BamHI and PstI sites of the KC515 vector. The resulting recombinant phage were used to
transform the FK-506 and FK-520 producer strains to yield the desired recombinant cells,
as described in the preceding Examples.

The following table shows the location and sequences surrounding the engineered
site of each of the heterologous AT domains employed. The FK-506 hybrid construct was
used as a control for the FK-520 recombinant cells produced, and a similar FK-520
hybrid construct was used as a control for the FK-506 recombinant cells.



Heterologous AT                  Enzyme   Location of Engineered Site

FK-506 AT8 (hydroxymalonyl)      AvrII    GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC
                                          GRPRRAAVSSF
                                 NheI     ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC
                                          TQHPAMGERLA
                                 XhoI     TACGCCTTCCAGCGGCGGCCCTACTGGatcgag
                                          YAFQRRPYWIE

rapamycin AT3 (methylmalonyl)    AvrII    GACCGGccccgtCGGGCGGGCGTGTCGTCCTTC
                                          DRPRRAGVSSF
                                 NheI     TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG
                                          WQWLGMGSALR
                                 XhoI     TACGCCTTCCAACACCAGCGGTACTGGgtcgag
                                          YAFQHQRYWVE

rapamycin AT12 (malonyl)         AvrII    GGCCGAgcgcgcCGGGCAGGCGTGTCGTCCTTC
                                          GRARRAGVSSF
                                 NheI     TCGCAGCGTGCTGGCATGGGTGAGGAactggcC
                                          SQRAGMGEELA
                                 XhoI     TACGCCTTCCAGCACCAGCGCTACTGGctcgag
                                          YAFQHQRYWLE

DEBS AT1 (methylmalonyl)         AvrII    GCGCGAccgcgcCGGGCGGGGGTCTCGTCGTTC
                                          ARPRRAGVSSF
                                 NheI     TGGCAGTGGGCGGGCATGGCCGTCGAcctgctC
                                          WQWAGMAVDLL
                                 XhoI     TACCCGTTCCAGCGCGAGCGCGTCTGGctcgaa
                                          YPFQRERVWLE

DEBS AT2 (methylmalonyl)         AvrII    GACGGGgtgcgcCGGGCAGGTGTGTCGGCGTTC
                                          DGVRRAGVSAF
                                 NheI     GCCCAGTGGGAAGGCATGGCGCGGGAgttgttG
                                          AQWEGMARELL
                                 XhoI     TATCCTTTCCAGGGCAAGCGGTTCTGGctgctg
                                          YPFQGKRFWLL
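
The table pairs each 33-nucleotide window with an 11-residue peptide and marks the region to be engineered in lower case. As a rough consistency check, the following Python sketch translates the FK-506 AT8 windows with the standard genetic code and counts how many bases of the lower-case region differ from the enzyme's recognition site. The recognition sequences (AvrII CCTAGG, NheI GCTAGC, XhoI CTCGAG) are general knowledge rather than values stated in the table, and the helper names are hypothetical, chosen only for illustration.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: recognition sites supplied from general knowledge, not from the table.
SITES = {"AvrII": "CCTAGG", "NheI": "GCTAGC", "XhoI": "CTCGAG"}

# (enzyme, nucleotide window, peptide) copied from the FK-506 AT8 rows of the table.
FK506_AT8 = [
    ("AvrII", "GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC", "GRPRRAAVSSF"),
    ("NheI", "ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC", "TQHPAMGERLA"),
    ("XhoI", "TACGCCTTCCAGCGGCGGCCCTACTGGatcgag", "YAFQRRPYWIE"),
]


def translate(dna):
    """Translate a DNA string in frame 1, ignoring case and any trailing partial codon."""
    seq = dna.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def marked_region(dna):
    """Return the lower-case characters, i.e. the region flagged for engineering."""
    return "".join(ch for ch in dna if ch.islower())


def mismatches(region, site):
    """Count positions at which the marked region differs from the recognition site."""
    return sum(a != b for a, b in zip(region.upper(), site))


def check_rows(rows):
    """For each row, report (enzyme, peptide matches translation, bases to change)."""
    return [
        (enzyme, translate(dna) == peptide, mismatches(marked_region(dna), SITES[enzyme]))
        for enzyme, dna, peptide in rows
    ]


if __name__ == "__main__":
    for row in check_rows(FK506_AT8):
        print(row)
```

The same check can be run on the other rows by adding them to the list; it only confirms that the printed peptides agree with the printed nucleotides and says nothing about how the primers were actually designed.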



dc- 176500 



PATENT 

AttyDkt: 300622002600 




The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGG ccacgg C
AGAVELLTSARPWPETDRPR
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG
RAAVSSFGVSGTNAHVILEA
GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG
GPVTETPAASPSGDLPLLVS
CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA
ARSPEALDEQIRRLRAYLDT
CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC
TPDVDRVAVAQTLARRTHFA
ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG
HRAVLLGDTVITTPPADRPD
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGC Agctcg
ELVFVYSGQGTQHPAMGEQL
cCGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC
AAAHPVFADAWHEALRRLDN


The sequences shown below provide the location of the AT/DH boundary chosen
in the FK-520 module 8 coding sequences. The region where an XhoI site was
engineered is indicated by lower case and underlining.

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC
ILGAGSRHDADVPAYAFQRR
ACTACTGG atcgag TCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT
HYWIESARPAASDAGHPVLG

The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

TCGGCCAGGCCGTGGCCGCGGACCGGCCGT ccgcgc CGTGCGGCGGTCTCGTCGTTCGGG
SARPWPRTGRPRRAAVSSFG
GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG
VSGTNAHIILEAGPDQEEPS
GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC
AEPAGDLPLLVSARSPEALD
GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC
EQIGRLRDYLDAAPGVDLAA
GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGCCGTACTGCTCGGTGAC
VARTLATRTHFSHRAVLLGD
ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA
TVITAPPVEQPGELVFVYSG
CAGGGCACCCAGCATCCCGCGATGGGTGAGCG gctcgc CGCAGCCTTCCCCGTGTTCGCC
QGTQHPAMGERLAAAFPVFA
GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGATCGAGTCCGCGCCG
DPDVPAYAFQRRPYWIESAP

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-506 module 8 coding sequences. The region where an XhoI site was 
engineered is indicated by lower case and underlining. 

GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGG atcgag TCCGCGCCG 
DPDVPAYAFQRRPYWIESAP 
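
The effect of engineering a restriction site at each marked position can be checked by translating the coding sequence before and after the change. The following is a minimal sketch only, not part of the disclosed methodology. It assumes that each lower-case hexamer in the FK-506 module 8 sequences above is replaced base-for-base by the canonical recognition sequence of the named enzyme (AvrII, CCTAGG; NheI, GCTAGC; XhoI, CTCGAG) and that each line is read in the frame of the amino acid sequence printed beneath it. The function and variable names are illustrative.

```python
import re
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string from its first base; a trailing partial codon is ignored."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def engineer(marked, site):
    """Replace the single lower-case run in `marked` with `site` (assumed substitution)."""
    return re.sub(r"[a-z]+", site, marked.replace(" ", ""), count=1)


# FK-506 module 8 lines as printed above; lower case marks the engineered region.
KS_AT_5PRIME = "TCGGCCAGGCCGTGGCCGCGGACCGGCCGT ccgcgc CGTGCGGCGGTCTCGTCGTTCGGG"
KS_AT_3PRIME = "CAGGGCACCCAGCATCCCGCGATGGGTGAGCG gctcgc CGCAGCCTTCCCCGTGTTCGCC"
AT_DH = "GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGG atcgag TCCGCGCCG"

for label, marked, site in [
    ("AvrII", KS_AT_5PRIME, "CCTAGG"),
    ("NheI", KS_AT_3PRIME, "GCTAGC"),
    ("XhoI", AT_DH, "CTCGAG"),
]:
    before = translate(marked)
    after = translate(engineer(marked, site))
    verdict = "silent" if before == after else "changes the encoded sequence"
    print(f"{label}: {before} -> {after} ({verdict})")
```

Under these assumptions the AvrII and NheI substitutions leave the encoded amino acids unchanged, whereas the XhoI substitution converts the isoleucine codon ATC to the leucine codon CTC.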

Example 4 

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520 
The methods and reagents of the present invention also provide novel FK-506 and 
FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or 
methyl. These derivatives are produced in recombinant host cells of the invention that 
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS 
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the 
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the 
present invention provides recombinant PKS enzymes in which the AT domains of both 
modules 7 and 8 have been changed. The table below summarizes the various compounds 
provided by the present invention. 



Compound   C-13       C-15       Derivative Provided 

FK-506     hydrogen   hydrogen   13,15-didesmethoxy-FK-506 
FK-506     hydrogen   methoxy    13-desmethoxy-FK-506 
FK-506     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-506 
FK-506     methoxy    hydrogen   15-desmethoxy-FK-506 
FK-506     methoxy    methoxy    Original Compound - FK-506 
FK-506     methoxy    methyl     15-desmethoxy-15-methyl-FK-506 
FK-506     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-506 
FK-506     methyl     methoxy    13-desmethoxy-13-methyl-FK-506 
FK-506     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-506 
FK-520     hydrogen   hydrogen   13,15-didesmethoxy-FK-520 
FK-520     hydrogen   methoxy    13-desmethoxy-FK-520 
FK-520     hydrogen   methyl     13,15-didesmethoxy-15-methyl-FK-520 
FK-520     methoxy    hydrogen   15-desmethoxy-FK-520 
FK-520     methoxy    methoxy    Original Compound - FK-520 
FK-520     methoxy    methyl     15-desmethoxy-15-methyl-FK-520 
FK-520     methyl     hydrogen   13,15-didesmethoxy-13-methyl-FK-520 
FK-520     methyl     methoxy    13-desmethoxy-13-methyl-FK-520 
FK-520     methyl     methyl     13,15-didesmethoxy-13,15-dimethyl-FK-520 
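
The derivative names in the table follow a regular pattern: every position that no longer bears a methoxy group is listed as "desmethoxy", and every position that bears a methyl group is listed again as "methyl". The following is a minimal sketch of that pattern, offered only as an illustration; the function name `derivative_name` and its arguments are hypothetical and do not appear in the specification.

```python
NATIVE = "methoxy"
POSITIONS = ("13", "15")
ALLOWED = {"hydrogen", "methoxy", "methyl"}


def derivative_name(parent, c13, c15):
    """Build the table's derivative name from the C-13 and C-15 substituents."""
    substituents = dict(zip(POSITIONS, (c13, c15)))
    for value in substituents.values():
        if value not in ALLOWED:
            raise ValueError(f"substituent not covered by the table: {value}")

    # Positions that have lost the native methoxy group, and those now bearing methyl.
    removed = [p for p in POSITIONS if substituents[p] != NATIVE]
    methylated = [p for p in POSITIONS if substituents[p] == "methyl"]
    if not removed:
        return f"Original Compound - {parent}"

    def part(positions, stem):
        # Two positions take the multiplying prefix "di", as in "didesmethoxy".
        prefix = "di" if len(positions) == 2 else ""
        return f"{','.join(positions)}-{prefix}{stem}"

    pieces = [part(removed, "desmethoxy")]
    if methylated:
        pieces.append(part(methylated, "methyl"))
    return "-".join(pieces + [parent])


for parent in ("FK-506", "FK-520"):
    for c13 in ("hydrogen", "methoxy", "methyl"):
        for c15 in ("hydrogen", "methoxy", "methyl"):
            print(parent, c13, c15, derivative_name(parent, c13, c15))
```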



Example 5 

Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520 
The present invention also provides novel FK-506 and FK-520 derivative 
compounds in which the methoxy groups at either or both the C-13 and C-15 positions 
are instead ethyl groups. These compounds are produced by novel PKS enzymes of the 
invention in which the AT domains of modules 8 and/or 7 are converted to ethylmalonyl 
specific AT domains by modification of the PKS gene that encodes the module. 
Ethylmalonyl specific AT domain coding sequences can be obtained from, for example, 
the FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The 
novel PKS genes of the invention include not only those in which either or both of the 
AT domains of modules 7 and 8 have been converted to ethylmalonyl specific AT 
domains but also those in which one of the modules is converted to an ethylmalonyl 
specific AT domain and the other is converted to a malonyl specific or a methylmalonyl 
specific AT domain. 
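
The combinations of AT domain specificities recited in this example can be enumerated mechanically. The following is a minimal sketch, assuming that module 8 determines the C-13 substituent and module 7 the C-15 substituent (as in Example 4), and assuming the usual correspondence of malonyl to hydrogen, methylmalonyl to methyl, and ethylmalonyl to ethyl. The label "native" stands for an unmodified AT domain, and all names are illustrative.

```python
from itertools import product

# Assumed mapping from AT domain specificity to the resulting side chain.
SIDE_CHAIN = {
    "native": "methoxy",
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}


def example5_combinations():
    """Return every module 8 / module 7 pairing with at least one ethylmalonyl AT."""
    combos = []
    for at8, at7 in product(SIDE_CHAIN, repeat=2):
        if "ethylmalonyl" in (at8, at7):
            combos.append({
                "module 8 AT": at8,
                "module 7 AT": at7,
                "C-13": SIDE_CHAIN[at8],
                "C-15": SIDE_CHAIN[at7],
            })
    return combos


for combo in example5_combinations():
    print(combo)
```

Under these assumptions there are seven pairings: three in which one or both AT domains are ethylmalonyl specific and the other, if any, is unmodified, and four in which one AT domain is ethylmalonyl specific and the other is malonyl or methylmalonyl specific.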

Example 6 
Neurotrophic Compounds 
The compounds described in Examples 1 - 4, inclusive, have immunosuppressant 
activity and can be employed as immunosuppressants in a manner and in formulations 
similar to those employed for FK-506. The compounds of the invention are generally 
effective for the prevention of organ rejection in patients receiving organ transplants and 
in particular can be used for immunosuppression following orthotopic liver 
transplantation. These compounds also have pharmacokinetic properties and metabolism 
that are more advantageous for certain applications relative to those of FK-506 or 
FK-520. These compounds are also neurotrophic; however, for use as neurotrophins, it is 
desirable to modify the compounds to diminish or abolish their immunosuppressant 
activity. This can be readily accomplished by hydroxylating the compounds at the C-18 
position using established chemical methodology or novel FK-520 PKS genes provided 
by the present invention. 

Thus, in one aspect, the present invention provides a method for stimulating nerve 
growth that comprises administering a therapeutically effective dose of 
18-hydroxy-FK-520. In another embodiment, the compound administered is a 
C-18,20-dihydroxy-FK-520 derivative. In another embodiment, the compound 
administered is a C-13-desmethoxy and/or C-15-desmethoxy 18-hydroxy-FK-520 
derivative. In another embodiment, the compound administered is a C-13-desmethoxy 
and/or C-15-desmethoxy 18,20-dihydroxy-FK-520 derivative. In other embodiments, the 
compounds are the corresponding analogs of FK-506. The 18-hydroxy compounds of the 
invention can be prepared chemically, as described in U.S. Patent No. 5,189,042, 
incorporated herein by reference, or by fermentation of a recombinant host cell provided 
by the present invention that expresses a recombinant PKS in which the module 5 DH 
domain has been deleted or rendered non-functional. 

The chemical methodology is as follows. A compound of the invention (~200 mg) 
is dissolved in 3 mL of dry methylene chloride and added to 45 µL of 2,6-lutidine, and 
the mixture is stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl 
trifluoromethanesulfonate (64 µL) is added by syringe. After 15 minutes, the reaction 
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with 
brine, and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo 
and flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol) 
gives the protected compound, which is dissolved in 95% ethanol (2.2 mL) and to which 
is added 53 µL of pyridine, followed by selenium dioxide (58 mg). The flask is fitted 
with a water condenser and heated to 70°C on a mantle. After 20 hours, the mixture is 
cooled to room temperature, filtered through diatomaceous earth, and the filtrate poured 
into a saturated sodium bicarbonate solution. This is extracted with ethyl acetate, and the 
organic phase is washed with brine and dried over magnesium sulfate. The solution is 
concentrated and purified by flash chromatography on silica gel (ethyl acetate:hexane 
(1:2) plus 1% methanol) to give the protected 18-hydroxy compound. This compound is 
dissolved in acetonitrile and treated with aqueous HF to remove the protecting groups. 
After dilution with ethyl acetate, the mixture is washed with saturated bicarbonate and 
brine, dried over magnesium sulfate, filtered, and evaporated to yield the 18-hydroxy 
compound. Thus, the present invention provides the C-18-hydroxyl derivatives of the 
compounds described in Examples 1 - 4. 

Those of skill in the art will recognize that other suitable chemical procedures can 
be used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et 
al., Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506 
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These 
methods can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with 
the R enantiomer showing a somewhat lower IC50, which may be preferred in some 
applications. See Kawai et al., supra. Another preferred protocol is described in Umbreit 
and Sharpless, 1977, JACS 99(16): 1526-28, although it may be preferable to use 30 
equivalents each of SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents, 
respectively, described in that reference. 
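
The reagent quantities implied by a given number of equivalents follow from simple arithmetic. The following is a minimal sketch only; the substrate mass and molecular weight used in the usage lines are placeholders chosen for illustration (the molecular weight is approximately that of unmodified FK-520), not values taken from this specification, and the function name is hypothetical.

```python
# Approximate molecular weights in g/mol.
SEO2_MW = 110.97    # selenium dioxide
TBUOOH_MW = 90.12   # tert-butyl hydroperoxide


def reagent_mass_mg(substrate_mg, substrate_mw, equivalents, reagent_mw):
    """Mass of reagent (mg) that supplies `equivalents` relative to the substrate."""
    substrate_mmol = substrate_mg / substrate_mw
    return substrate_mmol * equivalents * reagent_mw


# Placeholder substrate: 200 mg at an assumed molecular weight of 792.0 g/mol.
substrate_mg, substrate_mw = 200.0, 792.0
for name, mw in (("SeO2", SEO2_MW), ("t-BuOOH", TBUOOH_MW)):
    mass = reagent_mass_mg(substrate_mg, substrate_mw, 30, mw)
    print(f"30 equivalents of {name}: {mass:.0f} mg")
```

The molecular weight of the actual protected intermediate should be substituted for the placeholder before the result is relied upon.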

All scientific and patent publications referenced herein are hereby incorporated by 
reference. The invention having now been described by way of written description and 
example, those of skill in the art will recognize that the invention can be practiced in a 
variety of embodiments, and that the foregoing description and examples are for purposes 
of illustration and not limitation of the following claims. 